1. Executive Summary

The goal of this randomized controlled trial was to evaluate the effectiveness of a 
small-group intervention in fractions for fifth-graders who are performing below grade 
level in mathematics. The impact of the fractions intervention was assessed on fifth- 
grade at-risk students’ understanding of foundational fractions concepts and procedural 
competence with fractions. 

For the fractions intervention, lessons from the TransMath® curriculum (Level 2; 
Woodward & Stroh, 2015) were modified to create 52 thirty-five-minute lessons focused 
only on fourth- and fifth-grade level fractions content that could be used in a small-group 
setting. Lessons were structured so that each included review, demonstrations of 
concepts, student-led problem solving, and individual practice. 

The TransMath curriculum was selected as a platform to examine fifth-grade at- 
risk students’ understanding of foundational fractions concepts and procedural 
competence with fractions because it provides a balance among understanding of 
concepts, procedural competence, and problem solving. The curriculum emphasizes 
consistent use of number lines to build foundational fractions understanding and 
procedural competence, as well as discussions to enhance understanding and word 
problems to expand problem-solving abilities. 

In this rigorous large-scale RCT, a sample of 1,123 students from three school
districts across two U.S. states was screened at the beginning of the school year using
a fractions measure developed by the research team. Two hundred and five students
who scored between the 15th and the 37th percentile on the screening measure and
received parental consent to participate were randomly assigned to the intervention (n =
102) and comparison (n = 103) conditions.


Students in the fractions intervention condition received 35-minute tutoring 
sessions 3-4 times a week. Fraction instruction using the 52 TransMath lessons was 
provided by trained tutors. The comparison condition (n = 103) was business as usual 
instruction (i.e., core classroom fractions instruction, including intervention or support 
traditionally provided by the school). 

Results from the final analytic sample of 186 students (87 in intervention, 99 in 
comparison) showed that the intervention group significantly outperformed the 
comparison group on all outcome measures, which included an array of assessments 
used to measure both student understanding of foundational fractions concepts as well 
as procedural competence with grade-level fraction material (Hedges’ g = .66 to 1.08; p 
< .0001). 

2. Introduction 

According to most current thinking on interventions for struggling learners in 
mathematics (e.g., Fennell, 2011; Fuchs et al., 2013; Gersten et al., 2009; Gersten, 
Taylor, Keys, Rolfhus, & Newman-Gonchar, 2014; Gersten et al., 2015), successful 
mathematics interventions are primarily preventative: that is, as much as possible, they 
proactively teach grade-level content or missing foundational concepts in a small-group 
setting that allows for much more support than would be feasible in a class of 30 
students. Most argue that preventative (Tier 2) interventions should also build relevant 
foundational mathematical knowledge (both mathematics concepts and procedural 
proficiencies) essential for understanding grade-level content. In addition, intervention 
content should be based on best contemporary thinking about the mathematical content 
to teach. A source of confusion and occasional contention is the extent to which
interventions for struggling learners should also include open-ended problem-solving
activities and/or allow students to solve problems in more than one manner and discuss
their reasons for choosing the strategy they used. Perhaps with the exception of one or 
two programs (e.g., Smith, Cobb, Farran, Cordray, & Munter, 2013), most studies of 
mathematics interventions do not include such procedures. 

Current thinking in mathematics education (Carpenter, Fennema, Franke, Levi, &
Empson, 2014; Clarke, Roche, Sullivan, & Cheeseman, 2014) stresses the importance of
not only building understanding of mathematical ideas and proficiencies, but also
developing students’ ability to solve problems and occasionally devise or invent their
own strategies (as reflected, for example, in the Standards for Mathematical Practice
MP1; National Governors Association Center for Best Practices [NGA Center], &
Council of Chief State School Officers [CCSSO], 2010). In the interest of exploring
approaches that are more open-ended and reflect this thinking, the research team was
interested in evaluating an intervention program that is consistent with these principles
but still devotes a reasonable portion of each session to systematically building
foundational mathematical skills and understandings. The research team selected the
TransMath® fractions intervention (Woodward & Stroh, 2015) because it incorporates 
this balance among understanding of concepts, procedural competence, and problem 
solving. The program includes important fractions concepts and ideas articulated in the 
CCSS-M for Grades 4 and 5, as well as essential material in Grade 5 standards linking 
computational procedures to the underlying fractions concepts and ideas. Thus, the 
TransMath intervention fills the gaps in foundational fractions knowledge (e.g., 
understanding fractions as part of a set as well as part of a whole; locating fractions on 
a number line; equivalence) and proactively supports student learning of grade-level
content in their regular classroom (i.e., fraction computation and linkage of
computational methods to appropriate visual representations). TransMath also devotes
significant time for students to solve problems and discuss their solutions. 

The goal of this efficacy study was to conduct a rigorous randomized controlled 
trial to evaluate the effectiveness of TransMath, a small-group intervention in fractions 
for fifth-graders who are performing well below grade level in mathematics. The impact 
of the TransMath fractions intervention was assessed on fifth-grade at-risk students’ (a) 
understanding of foundational fractions concepts, and (b) procedural competence with 
fractions. A wide range of measures was used to assess the impact, including two 
measures developed and tested by the IES Center for Improving Learning of Fractions: 
a fractions concepts measure aligned with Grade 4 CCSS-M and a test of fractions 
procedures aligned with Grades 4 and 5 CCSS-M. The impact was also assessed using 
a series of performance assessment tasks and number line estimation tasks. In 
addition, to better understand the learning environment that led to enhanced outcomes 
for this group of students, the CLASS observational system (Pianta, Hamre, & Mintz, 
2012a) and a survey on instructional practices were used to measure the nature of the
instruction in both intervention and comparison conditions. 

3. Method 
3.1 Setting and Participants 

Student sample. One hundred and eighty-six fifth-grade students (89 boys, 97
girls) from three school districts (two urban districts on the West Coast and one
urban-adjacent district in the Southeast) participated in this randomized controlled
trial. Student baseline characteristics are summarized in Table 1. Chi-square analyses
and t tests revealed no statistically significant differences between intervention and
comparison students on any of the demographic variables or the pretest measures.


None of the 186 students had an IEP in mathematics. However, teachers 
confirmed that all students who met criteria had experienced persistent struggles 
learning mathematics, as one would expect given the screening and selection 
procedure (see the Screening criteria section, below). Thus, this would be considered a 
preventative intervention for students who did not master fourth-grade material in 
fractions, but whose problems were not so severe that they would be considered
students with mathematics learning disabilities or in need of one-on-one intensive
intervention.


Table 1
Baseline Characteristics of the Student Analytic Sample

                              Intervention     Comparison
                              (n = 87)         (n = 99)
                              Percent          Percent          χ² (df)       p
Gender                                                          0.49 (1)      .485
  Female                      49.43            54.55
Race/Ethnicity                                                  3.41 (5)      .636
  African American/Black      14.94            15.15
  Asian                        5.75             6.06
  Hispanic/Latino             18.39            16.16
  White                       42.53            34.34
  Multiracial                 18.39            27.27
  Missing                      0.00             1.01
Free/Reduced Lunch                                              0.19 (2)      .909
  Yes                         57.47            54.55
  No                          27.59            30.30
  Missing                     14.94            15.15
IEP/504                                                         0.10 (2)      .950
  Yes                          6.90             8.08
  No                          43.68            42.42
  Missing                     49.43            49.49

                              Mean Raw         Mean Raw
Pretest Measures              Score (SD)       Score (SD)       t (df)        p
TUF-4                         11.49 (1.44)     11.45 (1.55)     0.18 (184)    .857
WRAT-4                        97.91 (10.46)    96.18 (11.21)    1.08 (184)    .281
Procedures (Add/Sub)           5.69 (4.03)      5.13 (4.12)     0.93 (184)    .353
NLE 0-1                       72.50 (10.54)    74.11 (12.14)    0.96 (184)    .339

Note. Total analytic sample N = 186 students (I = 87, C = 99). WRAT-4 scores are standard
scores. WRAT-4 standard score of 97.91 = 44.45th percentile. WRAT-4 standard score of 96.18
= 39.95th percentile.
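
As a hedged illustration of the baseline-equivalence checks, the sketch below recomputes the Pearson chi-square statistic for gender. The cell counts are not reported in this section; they are back-calculated here from the Table 1 percentages and group sizes (43 of 87 and 54 of 99 students female), and the function name is hypothetical. No continuity correction is applied, which is an assumption about how the original test was run.

```python
def pearson_chi_square(table):
    """Pearson chi-square statistic for a two-way table of counts (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Rows: intervention, comparison. Columns: female, male.
# Counts are back-calculated from the Table 1 percentages (an assumption, not source data).
gender_counts = [[43, 44], [54, 45]]
chi2 = pearson_chi_square(gender_counts)
print(f"chi-square(1) = {chi2:.2f}")  # prints "chi-square(1) = 0.49"
```

Under these assumptions the statistic agrees with the value of 0.49 reported for gender in Table 1.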


Screening criteria. A sample of 1,123 students from three school districts was
screened at the beginning of the school year using a fractions measure developed by
the research team, Test for Understanding of Fractions, Fourth-Grade (TUF-4;
Instructional Research Group [IRG], 2014). The goal was to include students who scored
between the 15th and 35th percentile on a fractions measure. The lowest-35-percent
cutoff has commonly been used in other studies examining the effectiveness of
interventions for low-performing students (Fuchs et al., 2016; Powell & Fuchs, 2015).
The intent was to provide a preventative Tier 2 intervention, and therefore we excluded
students below the 15th percentile as they might not benefit from a Tier 2 intervention
alone and were likely to require a more intensive Tier 3 level of intervention.

Three hundred and twenty-six students (102 from District 1, 147 from District 2,
and 77 from District 3) fell between the 15th and 37th percentile on a distribution for the
TUF-4 measure that was computed across the entire sample from the three districts.¹ Of
the 326 students who qualified, 121 (31 from District 1, 50 from District 2, and 40 from 
District 3) were excluded prior to random assignment. Of these 121 students, 58 
students were excluded by classroom teachers for a variety of reasons (e.g., conflicting 
schedules resulting from other services, not in need of intervention, no parental 
consent), and another 63 students were randomly excluded by the research team to 
make numbers more manageable at some school sites. This resulted in 205 students 
(71 from District 1, 97 from District 2, and 37 from District 3) who met criteria and had 
received parental consent to participate. These 205 students were randomly assigned 
to intervention (n = 102) and comparison (n = 103) conditions. 
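
The percentile-band screening described above can be sketched as follows. This is a minimal illustration, not the research team's procedure: the mid-rank definition of percentile, the function names, and the demonstration scores are all assumptions; only the 15th-to-37th percentile band and the idea of computing the distribution across the whole screened sample come from the text.

```python
from bisect import bisect_left, bisect_right


def percentile_ranks(scores):
    """Mid-rank percentile (0-100) of each score within the full screened sample."""
    ordered = sorted(scores)
    n = len(ordered)
    ranks = []
    for s in scores:
        below = bisect_left(ordered, s)            # number of scores strictly lower
        ties = bisect_right(ordered, s) - below    # number of scores equal to s
        ranks.append(100.0 * (below + 0.5 * ties) / n)
    return ranks


def eligible(scores, low=15.0, high=37.0):
    """Indices of students whose percentile rank falls within [low, high]."""
    return [i for i, p in enumerate(percentile_ranks(scores)) if low <= p <= high]


# Hypothetical screening scores for ten students (not study data).
demo_scores = [3, 5, 6, 8, 9, 11, 12, 14, 17, 20]
print(eligible(demo_scores))  # prints [1, 2, 3]
```

Students below the lower bound are left out of the band on purpose, mirroring the decision to exclude those likely to need Tier 3 support.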

Sample attrition. Students without posttests were counted toward attrition. A 
total of 17 students dropped out of the study for the following reasons: mobility (4 
students: 3 intervention, 1 comparison), scheduling conflicts (6 students: 3 intervention, 
3 comparison), parental consent rescinded (7 students from intervention). Of the 
remaining 188 students, two students were missing some pretest data. Thus, the final
analytic sample (i.e., students with both pre- and posttest data) was 186 students (87 in
intervention, 99 in comparison). Overall attrition was 9.27%, and differential attrition was
10.82%. These rates are considered low by WWC Standards (U.S. Department of Education
[U.S. ED], Institute of Education Sciences [IES], & What Works Clearinghouse [WWC],
2017).

¹Of the 186 students in the analytic sample, 30.65% scored above the 32nd percentile on the
TUF-4 screening measure.
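
The attrition figures reported for this sample can be reproduced from the group sizes given in the text (102 and 103 students assigned; 87 and 99 in the analytic sample). The sketch below is illustrative only; the function and variable names are hypothetical, and differential attrition is taken to be the absolute difference between the two groups' attrition rates.

```python
def attrition_rates(assigned, analyzed):
    """Overall and differential attrition (in percent) from group sizes."""
    lost = {group: assigned[group] - analyzed[group] for group in assigned}
    overall = 100.0 * sum(lost.values()) / sum(assigned.values())
    per_group = {group: 100.0 * lost[group] / assigned[group] for group in assigned}
    differential = abs(per_group["intervention"] - per_group["comparison"])
    return overall, differential, per_group


assigned = {"intervention": 102, "comparison": 103}
analyzed = {"intervention": 87, "comparison": 99}
overall, differential, per_group = attrition_rates(assigned, analyzed)
print(f"overall = {overall:.2f}%, differential = {differential:.2f}%")
# prints "overall = 9.27%, differential = 10.82%"
```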

Tutor sample. Ten tutors specifically hired by the research team participated in 
the study and provided intervention in mathematics. While none of the 10 tutors were 
employees of the school districts in which the study was conducted, they could be 
considered to be representative of those typically hired by districts to provide 
intervention within a response to intervention (RtI) framework. The tutor demographic
data are summarized in Table 2. All were female; half possessed a master’s degree, 
and half had prior experience tutoring in mathematics. Sixty percent were credentialed 
teachers. Average years of classroom teaching experience in elementary mathematics
was 7.7 (SD = 8.92), and average experience teaching fifth-grade mathematics was 2.4
years (SD = 4.86).

Table 2
Baseline Characteristics of the Tutor Sample

                                        Percentage of Tutors
Education Level
  Bachelor's                            50
  Master's                              50
Experience Tutoring in Math             50
Years Teaching Elementary Math
  None                                  30
  2-6                                   30
  10-15                                 30
  28                                    10
Years Teaching Fifth-Grade Math
  None                                  70
  1                                     10
  10+                                   20
Type of Teaching Credentialᵃ
  K-6                                   10
  K-8                                   30
  Multiple Subject K-12                 30
  None                                  40

Note. Total number of tutors in the analytic sample = 10.
ᵃPercentages do not sum to 100 because one tutor has both K-8 and Multiple Subject K-12
credentials.


3.2 Study Design 

In this multi-site randomized controlled trial, students were randomly assigned, 
by teacher, to intervention and comparison conditions. Randomization eliminates 
selection bias, within-school assignment leads to equivalence on district and school 
characteristics between the two conditions, and random assignment of students by 
teacher leads to equivalence in quality of core classroom mathematics instruction. 
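
Random assignment of students within each classroom teacher can be sketched as a blocked randomization. The code below is a minimal illustration under stated assumptions: the report does not describe the allocation mechanics, so the even split within each teacher, the handling of odd-sized blocks, the function name, and the demonstration roster are all hypothetical.

```python
import random
from collections import defaultdict


def assign_within_blocks(students, seed=0):
    """Randomly split each teacher's eligible students between two conditions.

    `students` is a list of (student_id, teacher_id) pairs.
    """
    rng = random.Random(seed)
    by_teacher = defaultdict(list)
    for student_id, teacher_id in students:
        by_teacher[teacher_id].append(student_id)

    assignment = {}
    for teacher_id in sorted(by_teacher):
        ids = by_teacher[teacher_id]
        rng.shuffle(ids)
        half = len(ids) // 2
        # In an odd-sized block, the extra student goes to a randomly chosen condition.
        if len(ids) % 2 and rng.random() < 0.5:
            half += 1
        for k, student_id in enumerate(ids):
            assignment[student_id] = "intervention" if k < half else "comparison"
    return assignment


# Hypothetical roster: 11 eligible students spread across three teachers.
demo_roster = [(f"s{i}", f"t{i % 3}") for i in range(11)]
demo_assignment = assign_within_blocks(demo_roster, seed=1)
```

Because every teacher contributes students to both conditions, differences in the quality of core classroom instruction are balanced across conditions by design.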

Core classroom mathematics instruction. Students in both intervention and 
comparison conditions received Tier 1 core classroom mathematics instruction from 
their mathematics teacher. My Math (McGraw-Hill Education, 2017b) was used as the 
core curriculum in District 1, California Math (McGraw-Hill Education, 2017a) was used
in District 2, and GO Math! (Houghton Mifflin Company, 2018) was used in District 3.
Additional information on the nature of fractions instruction in Tier 1 core mathematics
classrooms was obtained via classroom teacher surveys. Findings from these surveys 
are presented in Section 4.5. 

Fractions intervention condition. Fraction instruction started in the last week of
October 2016 and was completed by March 2017, except in four schools where, because
of scheduling conflicts, the sessions did not end until April 2017. Based on local needs and
schedules, the fractions intervention was implemented three times a week in nine tutoring
groups and four times a week in 12 tutoring groups. At the start of the study, each tutoring
group included 4-5 students (median group size = 5).

Fifty-two selected lessons from the TransMath® curriculum (Level 2; Woodward & 
Stroh, 2015) were used for the intervention. The TransMath curriculum was selected 
because it attempts to provide a balance between teaching mathematical ideas and 
teaching procedural proficiency, as well as attempting to explicitly link the two. Number 
lines were used consistently to build foundational understanding of both the part-whole
and, especially, the more difficult measurement interpretations of fractions, as well as
understanding of the four operations when applied to fractions. Word problems were
also emphasized to enhance understanding and build problem-solving abilities. 

The TransMath curriculum was designed for use in large-group settings such as 
lower track or double-dose classes for students who have struggled with mathematics. It 
is geared toward Grades 4-8, and it includes the full range of the mathematics content 
covered in those grade levels in alignment with the Common Core State Standards. 
Each lesson is intended to be 55-60 minutes in duration. 

For the fractions intervention, lessons from TransMath were reorganized by two
members of the research team to create 52 thirty-five-minute lessons focused only on
fractions that could be used in a small-group setting. The team identified the fractions
material from TransMath (Level 2) that addressed both fourth- and fifth-grade-level 
content, so that lessons or lesson segments devoted to measurement, geometry, and 
whole number operations were eliminated. Lessons were sometimes restructured so 
that each lesson included review, demonstrations of concepts, student-led problem 
solving, and individual practice in a 35-minute lesson. Lesson structure consisted of the 
following segments: 

1. Review: Practice previous day’s skills or prerequisite skills for the day (5
minutes),

2. Strategic Explicit Instruction: Instruction in key concepts along with checks for
understanding (10 minutes),

3. Guided Practice: Solving problems as a group or with a partner (10 minutes),
and

4. Student Explanations: Students solve problems independently and provide
explanations for their strategies to the small group (10 minutes).

Throughout the lesson, students were provided with specific corrective feedback
to decrease the occurrence of misconceptions and subsequent mathematical errors. In
addition, the vocabulary students needed to understand the lesson content was listed at 
the beginning of each lesson and specifically taught during the lesson. Vocabulary word 
walls were maintained as new vocabulary was introduced. 

Accurate student explanations were a goal of the intervention and, therefore, an
integral part of the TransMath curriculum. To that end, students were provided with
supports to help build their capacity in providing mathematically correct explanations of
their solution methods. For example, a multi-step prompt card that listed four steps for
writing explanations was developed. Steps were categorized as “thinking” steps
(e.g., “What’s the problem asking me?” and “What did I do to solve it?”) and “writing”
steps (e.g., “Write all the steps using mathematical vocabulary” and “Write why the
answer makes sense”). To further assist students in writing explanations that include
mathematical language, each student was provided with a vocabulary card that listed
relevant math terminology learned during the intervention (e.g., denominator, equivalent).
Thus, the multi-step prompt card and the vocabulary card were used during the
fractions intervention to support students as they practiced explaining their thinking.

In addition, a prompt card with the acronym LAPS was used to help students
when they added or subtracted mixed numbers. The four steps prompted students to: Look
(Are common denominators needed? Is regrouping needed?), Alter (change
denominators; regroup), Perform (add or subtract), and Simplify (reduce; regroup).
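
The four LAPS steps map naturally onto a short procedure. The sketch below is an illustration only, not material from the curriculum: it covers addition alone (subtraction would also need regrouping in the Alter step), and the function name, tuple representation, and example values are assumptions.

```python
from math import gcd


def laps_add(a, b):
    """Add two mixed numbers given as (whole, numerator, denominator) tuples."""
    (w1, n1, d1), (w2, n2, d2) = a, b
    # Look: are common denominators needed?
    if d1 != d2:
        # Alter: rewrite both fractions over the least common denominator.
        common = d1 * d2 // gcd(d1, d2)
        n1, n2 = n1 * (common // d1), n2 * (common // d2)
        d1 = d2 = common
    # Perform: add the whole numbers and the fraction parts separately.
    whole, num, den = w1 + w2, n1 + n2, d1
    # Simplify: regroup an improper fraction part, then reduce to lowest terms.
    whole, num = whole + num // den, num % den
    g = gcd(num, den)
    return whole, num // g, den // g


# 2 3/4 + 1 1/2 = 4 1/4 (example values chosen for illustration)
print(laps_add((2, 3, 4), (1, 1, 2)))  # prints (4, 1, 4)
```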

Scope of the fractions content in the intervention. The intervention lessons 
covered the fractions content identified in contemporary mathematics standards for 
Grades 4 and 5 (e.g., Common Core State Standards in Mathematics [CCSS-M]; NGA 
Center & CCSSO, 2010; California Common Core State Standards for Mathematics 
[California Department of Education, 2013]). In general, fourth-grade standards focus on 
foundational fractions concepts such as equivalence and ordering (CCSS-M 4.NF.A.1) 
and understanding unit fractions (CCSS-M 4.NF.B.3). They also address addition and 
subtraction with like denominators (e.g., CCSS-M 4.NF.B.3.D) and multiplication of a 
fraction with a whole number (CCSS-M 4.NF.B.4). Fifth-grade standards extend fraction 
learning to addition and subtraction with unlike denominators (CCSS-M 5.NF.A.1), 
multiplication of a fraction by a fraction (CCSS-M 5.NF.B.4), and division of a whole
number by a unit fraction (CCSS-M 5.NF.B.7).

Lessons 1-18 addressed foundational understanding. These lessons
emphasized understanding what a fraction is, magnitude of fractions, equivalent
fractions, developing understanding by comparing two fractions, ordering fractions from
least to greatest, and estimating fraction placement on the number line. Lessons 19-28
focused on addition and subtraction of fractions, while Lessons 29-42 addressed
multiplication and division of fractions. For instance, the lessons focused on the
underlying concepts and procedures for (a) addition and subtraction of fractions with like
denominators and unlike denominators, (b) multiplication of a whole number times a
fraction and a fraction times a fraction, and (c) division of a whole number by a unit
fraction and a unit fraction by a whole number. The lessons also focused on critical
fractions concepts related to computational procedures. Lessons explored, for instance,
why fractions with unlike denominators cannot be added or subtracted before the problem
is modified to include like denominators, or how the multiplication of two fractions
involves finding a fraction of a fraction (i.e., the product of two fractions is the same as
the first fraction of the second). The final set of lessons (Lessons 43-52)
included material on adding and subtracting mixed numbers. Thus, lessons focused on
fractions less than 1, fractions greater than or equal to 1, and mixed numbers. 
Throughout, other requisite skills with whole numbers were included (e.g., multiples and 
factors) to support solving fractions computation problems. 
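
The concepts behind these procedures can be checked with exact arithmetic. The sketch below uses Python's standard `fractions` module; the specific fractions are chosen here for illustration and are not taken from the lessons.

```python
from fractions import Fraction

# Unlike denominators: rewrite both fractions over a common denominator before adding.
a, b = Fraction(1, 2), Fraction(1, 3)
common_sum = Fraction(3, 6) + Fraction(2, 6)   # 1/2 = 3/6 and 1/3 = 2/6
assert a + b == common_sum == Fraction(5, 6)

# Multiplying two fractions is taking a fraction of a fraction:
# 1/2 of 3/4 means splitting 3/4 into 2 equal parts and keeping 1 of them.
half_of_three_quarters = Fraction(3, 4) / 2
assert Fraction(1, 2) * Fraction(3, 4) == half_of_three_quarters == Fraction(3, 8)

# Dividing a whole number by a unit fraction counts how many unit fractions fit.
thirds_in_two = 2 / Fraction(1, 3)
assert thirds_in_two == 6

# Dividing a unit fraction by a whole number splits the piece into smaller pieces.
sixth = Fraction(1, 3) / 2
assert sixth == Fraction(1, 6)
```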

In most instances, the TransMath curriculum limited fractions that students
encountered to those with familiar denominators that
were easier to manipulate and understand. This helped students focus on the concepts
being taught instead of getting distracted with intricate calculations with numbers that
are rarely encountered.

Use of concrete and semi-abstract visual representations. While number 
lines were used as the central visual representation, Cuisenaire Rods® (a concrete 
manipulative) and area models were also used to visually represent fractions and serve 
as a means for demonstrating important concepts related to fractions. Cuisenaire Rods 
are linear, 3D, color-coded, and of various sizes, and can be used interchangeably to 
represent one whole or parts of a whole, thus allowing for hands-on exploration of 
fractions principles. The representations were included strategically, as the ultimate goal 
was for students to solve problems without needing concrete or other visual 
representations. Cuisenaire Rods were particularly useful in scaffolding student learning 
of fractions concepts and procedures. Area models were used occasionally to clarify 
fractions concepts (e.g., part-whole relationship, multiplication).

Refinements based on the pilot study. The fractions intervention was tested 
via a small-scale pilot study and revised based on the formative data. For additional 
information on the pilot study and revisions made to the fractions intervention, see 
Schumacher et al. (2018). 

Comparison condition. The comparison condition was business-as-usual
instruction. As not all the schools in the study were set up to provide formally structured 
Tier 2 mathematics intervention, we anticipated that only some of the students in the 
comparison condition would receive some form of intervention, with the rest receiving 
core classroom instruction, be it in mathematics, another academic subject (e.g., 
science), or a non-academic class like P.E. To determine how the class time was being 
spent by the comparison students during the 35-minute intervention block when
intervention students were receiving the fractions intervention, surveys were
administered to classroom teachers in fall and spring. Data from these surveys are
discussed in Section 4.5.

3.3 Training of Tutors 

Tutors attended a two-day training conducted by the research team. The first day 
of training began with a brief discussion of the purpose of the research project, followed 
by an overview of the topics covered in the 52 TransMath lessons. It concluded with 
demonstration and practice with concrete representations (Cuisenaire Rods, which were 
new to many of the tutors), as well as visual representations (number lines). The second 
day focused on techniques for facilitating verbal and written explanations (many 
developed during the pilot study; Schumacher et al., 2018), demonstrations of correct 
fraction procedures (e.g., how to regroup fractions, distinguishing between least 
common multiple and greatest common factor), and identifying and solving word 
problems that required the four operations. During the training, tutors were told that 
fidelity of implementation would be assessed. Tutors were issued a digital recording 
device and taught how to record their TransMath lessons and upload them onto a 
secure, password-protected website. 

Ongoing support and feedback to tutors. All tutors received coaching from the 
research staff periodically throughout the course of the study to review lesson goals, 
address questions and concerns, and discuss their experiences implementing the 
curriculum (e.g., which aspects were difficult, which aspects were easy). Coaching 
phone calls were held every week for the first three weeks of the intervention with the 
entire group of tutors broken into two groups. A member of the research team listened 
to the audio recording of each tutor’s third or fourth lesson and provided the tutor with
immediate feedback. If the tutor was deemed to be “weak,” additional checks of the
audio recordings were conducted, and feedback was provided. As the intervention
progressed, coaching calls were held every three weeks with the whole group and on
an as-needed basis with individual tutors.

3.4 Measures 

Student measures. The following student measures were used in the study. 

1. Test for Understanding of Fractions, Fourth-Grade (TUF-4). The TUF-4 (IRG,
2014) was administered at pretest. The measure was used (a) to screen students for
eligibility, (b) as a potential covariate, (c) as a means to help describe pretest-posttest
growth in fractions knowledge for the two samples, and (d) as one of the major posttest
measures. The measure consists of 26 multiple-choice fraction items based on third-
and primarily fourth-grade CCSS in mathematics. The items were selected from publicly
available NAEP assessments for Grades 4 and 8 (National Center for Education
Statistics, 2009), Illustrative Mathematics (Illustrative Mathematics, 2013), and the
various measures used by researchers from the IES fraction item bank at the Center for
Improving Learning of Fractions (e.g., Fuchs et al., 2013; Jordan et al., 2013). This item
pool was then subjected to review by two research mathematicians involved in 
mathematics education, Kristin Umland and Jim Lewis, who scrutinized the accuracy of 
the mathematical language used and the extent to which understanding of key 
mathematical ideas and concepts was assessed. They also assisted in ensuring that all
relevant Grade 4 (and to some extent Grade 3) CCSS-M addressing fractions were
covered. The measure demonstrated a coefficient alpha reliability of .80 in the current
study and .86 in a previous, larger-scale study.² Item difficulty estimates range from
-14.26 to 3.82. Overall, 11 of the 26 items have difficulty estimates below zero.
Discrimination estimates for the 26 items range from -2.23 to 0.76.

²The Cronbach’s alpha based on a sample of 5,005 Grade 4 students is 0.86 (Jayanthi, Gersten,
Taylor, Smolkowski, & Dimino, 2017).
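
For readers unfamiliar with the coefficient alpha values reported for these measures, the standard formula can be sketched in a few lines. This is a generic illustration, not the analysis code used in the study; the function name and the small response matrix are hypothetical.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Coefficient alpha for a list of per-student lists of item scores."""
    k = len(item_scores[0])
    # Variance of each item across students.
    item_vars = [pvariance([student[j] for student in item_scores]) for j in range(k)]
    # Variance of the students' total scores.
    total_var = pvariance([sum(student) for student in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical 0/1 responses of five students to four items (not study data).
demo_responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(demo_responses)
print(f"alpha = {alpha:.2f}")  # prints "alpha = 0.80"
```

Alpha rises as items covary more strongly relative to their individual variances, which is why it is read as an index of internal consistency.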

2. Wide Range Achievement Test 4 (WRAT-4): Math Computation subtest. The 
40-item Math Computation subtest is a measure of general mathematics achievement. 
It was administered at pretest only as a potential covariate, because the TUF-4, by 
design, would have a restricted range, since it served as a screener. The WRAT-4 is a 
brief overview of general mathematics proficiency. Median reliabilities range from 0.83 
to 0.87 (Wilkinson & Robertson, 2006). 

3. Test for Understanding of Fractions, Fifth-Grade (TUF-5). Since the measure 
is aligned with Grade 5 CCSS-M related to fractions, TUF-5 (IRG, 2015) was used at 
posttest only. The items were also reviewed by mathematics educators (Francis Fennell 
and Karen Karp) to ensure comprehensiveness and alignment with CCSS-M standards 
and precision of mathematical language. It includes 18 items derived from NAEP and 
PARCC fourth- and fifth-grade assessments (Pearson Education, 2015). The internal 
consistency for the measure is .76. Item difficulty estimates range from -9.87 to 3.39. 
Overall, 5 of the 18 items have difficulty estimates below zero. Discrimination estimates 
for the 18 items range from -0.10 to 3.52. 

4. Test of Fraction Procedures. This 24-item measure, used as a posttest, was
adapted from the measure developed by Jordan et al. (2013). It assesses students’ skill
with the four arithmetic operations involving fractions, as articulated in the Grade 5
CCSS-M standards. The measure has an
internal consistency of .89. Item difficulty estimates range from -3.40 to 1.86. Overall,
11 of the 24 items have difficulty estimates below zero. Discrimination estimates for the
24 items range from 0.35 to 4.29. A shortened 12-item version of this test, containing
only the addition and subtraction items, was used as a pretest. The internal consistency
for the pretest measure is .86.

5. Number Line Estimation (NLE). Two measures, NLE 0-1 (nine items; Fuchs et
al., 2016) and NLE 0-2 (19 items; Fuchs et al., 2016), were used to assess students’
ability to place fractions on number lines with endpoints from 0 to 1 and 0 to 2,
respectively. NLE 0-1 was used as both a pretest and a posttest, whereas the more
difficult NLE 0-2 was used only as a posttest. Test-retest reliability for these measures
is .80 (Fuchs et al., 2016).

6. Curriculum-aligned assessment. This 25-item measure is aligned with the
intervention and includes multiple-choice and constructed-response items that assess
student proficiency in the concepts and procedures covered by the intervention. Items
on this test were selected from the end-of-unit tests and performance assessment tasks
present in the TransMath curriculum. Cronbach's alpha for the 25-item scale is .78. Item
difficulty estimates range from -2.55 to 3.65. Overall, 16 of the 25 items have difficulty
estimates below zero. Discrimination estimates for the 25 items range from 0.18 to 5.26.

7. Performance-based assessments. The assessment battery also included five
performance-based assessments that were administered throughout the course of the
intervention to all participating intervention students and to a randomly selected
subsample of 34 comparison students. Each assessment included one constructed-response
item (Chval, Lannin, Jones, & Dougherty, 2013) on a topic covered
in the intervention. See Table 3 for a description of the items from each performance
assessment. Students were asked not only to solve the problem, but also to provide a
written explanation for their response (i.e., their rationale).

The research team developed a unique rubric to score each problem. The rubrics
were developed in an iterative fashion by initially scoring a small set of student
responses. Two members of the research team then used the final rubrics to score all 
the student responses. Each rubric assessed the accuracy of the answer and the quality 
of the written explanation. Full points were awarded for a correct and simplified answer, 
and partial points were given for a correct but unsimplified answer. The students’ written 
explanations were graded according to a set of elements that were unique to each 
problem. The research team examined each problem and considered what critical 
understandings the students had to have for solving the problem (e.g., rationale for 
solving the problem, which operation to use and why, and how they would execute the 
operation). For example, the scoring rubric for the written explanations for the fifth 
performance assessment (i.e., Manny is training for a marathon. He ran 102 miles on 
Saturday. On Sunday, he ran 7= miles. How many more miles did Manny run on 
Saturday than Sunday?) allowed for a total of five points, one point for each of the 
following elements: (a) student writes about changing 2 to = or about common 
denominators, (b) student presents a rationale for subtracting, (c) student mentions that 
you cannot subtract without regrouping, (d) student explains the regrouping process, 
and (e) student mentions simplifying the answer. In addition, the scoring rubrics for 
performance assessments 3, 4, and 5 also awarded an additional point to students who 
identified and applied the correct operation to solve the problem. 

Given the moderate level of inference necessary for scoring the assessments, 
the two scorers were considered to be in agreement when their scores fell 
within one point of each other (i.e., plus or minus one). The mean inter-rater reliability 
was 99.42% for correct answer, 96.54% for appropriate written explanation, and 100% 
for correct operation. 


Table 3 
Performance-based Assessments 

Assessment  Problem 

1           Mark where 2 is located on the number line below. How did you know where 
            to place 2 on the number line? Explain your thinking. 

2           Order the fractions from least to greatest on the blank spaces below. Then, 
            mark and label these fractions on the number line: > = = 2. How did you 
            know where to place each fraction? Explain your thinking. 

3           Use a number line, draw pictures, or use numbers to solve the word problem. 
            On Saturday, Jessie walked = of a mile to the park. Then, from the park she 
            walked = of a mile to her friend Emily’s house. Next, Jessie and Emily walked 
            — of a mile together. How far did Jessie walk on Saturday? Explain your 
            thinking. 

4           Solve the word problem. Use pictures, number lines, or numbers to show your 
            problem solving. Bella likes to build with Legos. In her set of Legos, = are red. 
            Bella used = of her red Legos to build a fire truck. What fraction of her total set 
            of Legos did she use to build the fire truck? Explain your thinking. 

5           Solve the word problem. Use pictures, number lines, or numbers to show your 
            work. Simplify your answer by putting into lowest terms. Manny is training for 
            a marathon. He ran 102 miles on Saturday. On Sunday, he ran 7= miles. How 
            many more miles did Manny run on Saturday than Sunday? Explain your 
            thinking. 


Observations of tutoring groups. The Classroom Assessment Scoring System 
(CLASS): Upper Elementary (Pianta et al., 2012a) was used to assess the nature of 
intervention instruction in each tutoring group. The CLASS examines the quality of 
instruction by capturing the nature of the interactions between students and tutors. The 
CLASS consists of three domains (Emotional Support, Classroom Organization, and 
Instructional Support) and 12 dimensions that assess the quality of tutors’ instructional 
and social interactions. These instructional dimensions are rated on a 7-point Likert 
scale. See Table 4 for descriptions of each CLASS dimension. 

Each tutoring group was observed during one intervention session by a trained 
and certified CLASS observer. Detailed information on the CLASS ratings across the 21 
tutoring groups is presented in Section 4.2. 

Four tutoring groups were observed by two observers to establish inter-rater 
reliability, which was calculated using the percent agreement formula. The raters were 
considered to be in agreement if their ratings were within one point of each other on the 
7-point Likert scale. Inter-rater reliability, defined as agreement within one point, was 100%. 
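
The within-one-point percent agreement described above can be sketched as follows. This is a minimal illustration, not the research team's scoring script; the function name `adjacent_agreement` and the two rating lists are hypothetical.

```python
def adjacent_agreement(ratings_a, ratings_b, tolerance=1):
    """Percent of paired ratings that differ by at most `tolerance` points."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    # Count pairs whose absolute difference is within the tolerance.
    agreements = sum(abs(a - b) <= tolerance for a, b in zip(ratings_a, ratings_b))
    return 100.0 * agreements / len(ratings_a)


# Hypothetical ratings from two observers on a 7-point scale.
observer_1 = [5, 6, 4, 7, 3, 5]
observer_2 = [5, 5, 4, 6, 4, 7]
print(round(adjacent_agreement(observer_1, observer_2), 1))
```

In this made-up example five of the six pairs fall within one point, so agreement is five sixths rather than 100%.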

Table 4 
Upper Elementary CLASS Domains and Dimensions 

Domain                    Dimension                         Indicators 

Emotional Support         Positive Climate                  Relationships; positive affect; positive communications; respect 
                          Teacher Sensitivity               Awareness; responsiveness to academic and social/emotional 
                                                            needs and cues; effectiveness in addressing problems; student comfort 
                          Regard for Student Perspectives   Flexibility and student focus; connections to current life; support 
                                                            for autonomy and leadership; meaningful peer interactions 

Classroom Organization    Behavior Management               Clear expectations; proactive; effective redirection of 
                                                            misbehavior; student behavior 
                          Productivity                      Maximizing learning time; routines; transitions; preparation 
                          Negative Climate                  Negative affect; punitive control; disrespect 

Instructional Support     Instructional Learning Formats    Learning targets/organization; variety of modalities, strategies, 
                                                            and materials; active facilitation; effective engagement 
                          Content Understanding             Depth of understanding; communication of concepts and 
                                                            procedures; background knowledge and misconceptions; 
                                                            transmission of content knowledge and procedures; opportunity for 
                                                            practice of procedures and skills 
                          Analysis and Inquiry              Facilitation of higher-order thinking; opportunities for novel 
                                                            application; metacognition 
                          Quality of Feedback               Feedback loops; scaffolding; building on student responses; 
                                                            encouragement and affirmation 
                          Instructional Dialogue            Cumulative content-driven exchanges; distributed talk; 
                                                            facilitation strategies 

                          Student Engagement                Active engagement 

Surveys. The following surveys were administered. 

Classroom teacher survey of instructional practices. Classroom teachers 
were asked to complete a survey in fall and spring to determine when instruction in 
fractions was provided to intervention and comparison students in their core 
mathematics classrooms. Teachers were also asked several questions about the nature 
of their fractions instruction (e.g., fraction content that was covered, type of 
representations used, opportunities for students to explain their understanding). In 
addition, teachers were probed about the activities of the comparison group students 
during the 35 minutes when the intervention students were receiving the TransMath 
fractions intervention. Responses from the classroom teacher survey are summarized in 
Section 4.5. 

Student, tutor, and classroom teacher appraisal surveys. Students, tutors, 
and classroom teachers were given appraisal surveys at the end of the intervention in 
spring to solicit their feedback on the perceived benefits of the fractions intervention. 
Findings from these appraisal surveys are presented in Section 4.5. 

3.5 Fidelity of Implementation 

The research team selected a purposeful sample of eight TransMath lessons to 
assess fidelity (Lessons 4, 11, 20, 27, 31, 35, 43, and 47). These lessons were selected 
as they cover all phases of the instructional period and address critical intervention 
topics and instructional approaches. Fidelity was assessed on lessons that included 
instruction in foundational fractions concepts (e.g., equivalence and magnitude), the 
four operations, addition and subtraction with regrouping, written explanations, and 
word problems. Fidelity lessons also sampled instruction that relied on manipulatives 
and the number line representation. All eight sessions were scored for both fidelity to 
the procedures and activities required by the intervention curriculum and perceived 
implementation quality. 

All 52 TransMath lessons were audio recorded, and the tutors were not told 
which audio recordings would be checked for fidelity. For one tutoring group, fidelity was 
assessed through classroom observations (rather than by listening to audio recordings) 
as the parent/guardian of one participating student in the group declined permission to 
audio record. 

Procedural fidelity was assessed using checklists developed by the research 
team based on the lessons’ curricular content. The checklists included 55 procedures 
and activities, on average (range = 34 to 82 items). Each procedure/activity on the 
checklist was marked as observed or not observed. Fidelity was calculated as the 
percentage of procedures implemented (number of procedures observed ÷ total number 
of procedures [observed and not observed] × 100). 
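
As a small illustration of the fidelity percentage described above, the sketch below computes it from a checklist of observed/not-observed marks. The function name and the example checklist are hypothetical and are not taken from the study's fidelity data.

```python
def procedural_fidelity(observed_flags):
    """Percent of checklist items marked as observed.

    `observed_flags` holds one boolean per checklist item:
    True = observed, False = not observed.
    """
    if not observed_flags:
        raise ValueError("checklist must contain at least one item")
    return 100.0 * sum(observed_flags) / len(observed_flags)


# Hypothetical 55-item checklist with 47 procedures observed.
checklist = [True] * 47 + [False] * 8
print(round(procedural_fidelity(checklist), 2))
```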

Quality of implementation was assessed by rating the tutors on “qualities” 
generally associated with mathematics instruction: tutors’ pacing of the lesson, clarity 
and mathematical correctness of language, providing specific math-oriented praise, 
ability to enhance student explanations, and ability to maintain a positive rapport with 
students. The fidelity raters (members of the research team) also rated the 
implementation quality based on their perception of the students’ grasp of the content 
during the intervention session. They also provided an overall rating. All seven items 
were rated on a 5-point Likert scale, with 1 = low and 5 = high. See Section 4.4 for 
fidelity of implementation data. 

To determine inter-rater reliability of the fidelity data, 22 randomly selected audio 
recordings (13% of the total recordings) were coded by two raters. Given the moderate 
level of inference necessary for coding the lessons, agreement was defined as any 
ratings that fell within one point of each other. The mean inter-rater reliability was 
81.57% for the procedural items and 75.97% for the quality of instruction items. 

3.6 Data Analysis 

Main impact analysis. For all outcomes, we conducted an analysis of 
covariance (ANCOVA) for partially nested data described by Bauer, Sterba, and Hallfors 
(2008) and Baldwin, Bauer, Stice, and Rohde (2011). The study design called for the 
randomization of individual students in each class to either the TransMath intervention 
condition, with students nested within small groups, or a non-nested comparison 
condition. Thus, TransMath groups, but not the unclustered comparison group, required 
consideration of a group-level variance at the tutoring group level. This required an 
analytic model that accounted for the potential heterogeneity of variance across 
conditions (Roberts & Roberts, 2005). We used Satterthwaite approximation to 
determine degrees of freedom. 
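
One way to write the partially nested model described in this paragraph is sketched below. The notation is an assumption introduced here for illustration (it is not reproduced from the report): T_ij is the condition indicator, x_ij the vector of pretest covariates, u_j the tutoring-group effect that applies only to intervention students, and the residual variance is allowed to differ by condition.

```latex
\begin{aligned}
Y_{ij} &= \beta_0 + \beta_1 T_{ij} + \boldsymbol{\beta}_2^{\top}\mathbf{x}_{ij}
          + T_{ij}\,u_j + e_{ij}, \\
u_j &\sim N(0, \tau^2), \qquad
e_{ij} \sim
\begin{cases}
N(0, \sigma^2_{T}) & \text{if } T_{ij} = 1 \text{ (intervention)}, \\
N(0, \sigma^2_{C}) & \text{if } T_{ij} = 0 \text{ (comparison)}.
\end{cases}
\end{aligned}
```

Under this notation, the homoscedastic model is the special case in which the two residual variances are equal, and the intervention effect is the coefficient on T_ij.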

Because the residual variances may have differed between conditions, we tested 
the assumption of homoscedasticity of residuals and reported results of the most 
appropriate model for each outcome measure. We tested whether the homoscedastic 
and heteroscedastic models could be assumed equivalent with a likelihood ratio test 
and reported the simpler model if we were able to accept the equivalence of the two 
models. Because this tests the simpler model’s noninferiority when compared to the 
more complex model, we reversed the null and alternative hypotheses and, hence, the 
Type I and Type II error rates, α and β, as is common among equivalence or noninferiority 
trials (Dasgupta, Lawson, & Wilson, 2010). For this reason, and because of the poor statistical 
power associated with tests of variance structures (Kromrey & Dickinson, 1996), we set 
α = .20 as our Type I error rate and reported the more complex model unless we were 
relatively certain that the two were equivalent. 
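
The decision rule above can be sketched in a few lines. This is an illustrative sketch rather than the SAS code used in the study: the function names are hypothetical, and it assumes a likelihood ratio test with one degree of freedom (one extra residual variance in the more complex model), for which the chi-square upper tail has a closed form.

```python
import math


def lr_statistic(loglik_simple, loglik_complex):
    """Likelihood ratio chi-square from two nested models' log-likelihoods."""
    return 2.0 * (loglik_complex - loglik_simple)


def lr_p_value_1df(chi_square):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2.0))


def choose_variance_model(chi_square, alpha=0.20):
    """Keep the heteroscedastic model unless the test fails to reject at alpha."""
    p = lr_p_value_1df(chi_square)
    model = "heteroscedastic" if p < alpha else "homoscedastic"
    return model, p


# Two illustrative chi-square values, one on each side of the criterion.
for chi_square in (0.34, 5.08):
    print(chi_square, choose_variance_model(chi_square))
```

With the liberal criterion of .20, a chi-square of 5.08 (p of about .02) retains the heteroscedastic model, while 0.34 (p of about .56) falls back to the simpler homoscedastic model.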

Sensitivity analyses. We conducted several variations on the analysis to test 
the sensitivity of the main-effect results to the analytic method. Due to the conceptual 
and technical flaws noted by some with covariate adjustment (e.g., Spector & Brannick, 
2011; Willett, 1988), we examined condition differences on net gains with a mixed 
model Time x Condition analysis (Murray, 1998), modified to account for students 
partially nested within small groups (see Clarke et al., 2016, for details). This analysis 
included only a subset of the three measures with both pretest and posttest data. We 
also tested models without covariates, recommended by Simmons, Nelson, and 
Simonsohn (2011). We next tested a set of mixed models that nested students and their 
small groups within teachers or schools. Finally, Baldwin, Murray, and Shadish (2005) 
and Roberts and Roberts (2005) have argued that for individuals nested within an 
interventionist, the analysis should treat the interventionist as the unit of analysis. 
Hence, we also examined the effects for students nested within tutor rather than small 
group, as some tutors taught multiple small groups. 

Model estimation. We fit the aforementioned statistical models to our data using 
SAS PROC MIXED version 14.2 (SAS Institute, 2016) and restricted maximum 
likelihood estimation. The models assume independent and normally distributed 
residuals. We addressed the first assumption (van Belle, 2008) by explicitly modeling 
the multilevel nature of the data. Multilevel regression methods have also been shown 
to be quite robust to violations of normality (e.g., Hannan & Murray, 1996). 

Effect sizes and multiple tests. To interpret results, we computed the Hedges’ 
g effect size for each fixed effect of condition as recommended by the What Works 
Clearinghouse (U.S. Department of Education et al., 2017). We also corrected for 
multiple tests with the Benjamini-Hochberg procedure (Benjamini & Hochberg, 1995) 
and reported the original p-values as well as the Benjamini-Hochberg corrected 
p-values (pBHc) for each outcome. We adjusted p-values separately for the seven tests of 
main effects. 
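
For readers unfamiliar with the Benjamini-Hochberg correction, the sketch below computes adjusted p-values in the usual step-up form. It is a generic illustration, not the study's code, and the input p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four tests.
raw = [0.01, 0.04, 0.03, 0.005]
print([round(p, 4) for p in benjamini_hochberg(raw)])
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values from decreasing as the raw values increase.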

Moderation analysis. To determine whether students responded differentially to 
the TransMath intervention based on initial math achievement on WRAT-4, TUF-4, Test 
of Fraction Procedures (addition and subtraction items only), or NLE 0-1, we expanded 
the ANCOVA to include each moderator and its interaction with condition. We also 
examined whether the impacts varied by district, gender, and free and reduced lunch 
status. The tests for equivalence between homoscedastic and heteroscedastic 
variances were conducted in the same manner as for main effects, described above. 

Mediation analysis. For this analysis we examined the mediating effect of 
improved number line estimating skills on fraction understanding. We focused on the 
Number Line Estimation measure as a potential mediator as it is a widely used and 
accepted measure of students’ grasp of the measurement interpretation of fractions 
(e.g., Siegler & Pyke, 2013; Siegler, Thompson, & Schneider, 2011) and was shown to 
mediate fractions achievement (Fuchs et al., 2016). We examined the mediating effect 
on TUF-4, since it was the only measure that was given as a pre- and post-test, thus 
allowing the calculation of a gain score. The use of gain scores decreases the biases 
associated with mediation tests discussed by Maxwell and colleagues (Mitchell & 
Maxwell, 2013). The primary concern is that mediation models without repeated 
assessments of the mediator and outcome measures could produce biased results 
(Judd & Kenny, 1981). Our approach using gains has fewer problems with biases than 
the standard model (von Soest & Hagtvet, 2011), but it can still only document 
correlations, so neither temporal nor causal inferences are warranted. 

To offer evidence in support of mediation, we tested whether the effect of (X) 
condition on (Y) gains on the TUF-4 may have been explained by the indirect effect 
through (M) gains on the NLE 0-1. The indirect-effects model roughly follows Baron and 
Kenny's (1986) causal steps approach, which has considerable intuitive appeal but will, 
with cross-sectional and even sequential data, produce biased estimates (e.g., Judd & 
Kenny, 1981; Mitchell & Maxwell, 2013). As a potential remedy, von Soest and Hagtvet 
(2011) demonstrated an approach to mediation with growth curve models. With only two 
assessments, however, we estimated pre-post gains over time. The model does not use 
latent variables, so it does not benefit from increased power. Individual growth over time 
has been shown to be a "natural extension of the observed difference score" (Willett, 
1988, p. 414), so the methods of von Soest and Hagtvet (2011), when applied to gains, 
should test whether the data are consistent with the hypothesis of mediation. 

The indirect-effects model was estimated in Mplus (Muthén & Muthén, 1998-2017), 
and to address the nonnormality of the sampling distribution for the test of the 
indirect path from condition through NLE 0-1 to TUF-4, we used bias-corrected 
bootstrapped confidence intervals (Preacher & Hayes, 2008) based on 5,000 samples. 
This analysis did not nest intervention students within small groups or include 
covariates. Our sensitivity analyses of impact estimates produced similar intervention 
effects and standard errors under these conditions. 
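
The general idea of a bias-corrected bootstrap interval for an indirect effect can be sketched as below. This is not the Mplus analysis: it is a simplified stand-in that assumes a binary condition indicator (0/1), no covariates, and no nesting, and all function and variable names are hypothetical.

```python
import random
from statistics import NormalDist, mean


def indirect_effect(x, m, y):
    """a*b for binary x: a = group difference in m, b = slope of y on m given x."""
    groups = {0: [], 1: []}
    for xi, mi, yi in zip(x, m, y):
        groups[xi].append((mi, yi))
    a = mean(mi for mi, _ in groups[1]) - mean(mi for mi, _ in groups[0])
    sxy = sxx = 0.0
    for rows in groups.values():
        # Centering within condition removes the effect of x from m and y.
        m_bar = mean(mi for mi, _ in rows)
        y_bar = mean(yi for _, yi in rows)
        sxy += sum((mi - m_bar) * (yi - y_bar) for mi, yi in rows)
        sxx += sum((mi - m_bar) ** 2 for mi, _ in rows)
    return a * (sxy / sxx)


def bc_bootstrap_ci(x, m, y, n_boot=5000, level=0.95, seed=0):
    """Point estimate and bias-corrected bootstrap interval for a*b."""
    rng = random.Random(seed)
    n = len(x)
    estimate = indirect_effect(x, m, y)
    boots = []
    while len(boots) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        if 1 < sum(xs) < n - 1:  # need at least two cases in each condition
            boots.append(indirect_effect(xs, [m[i] for i in idx], [y[i] for i in idx]))
    boots.sort()
    nd = NormalDist()
    # Bias correction: how far the bootstrap median sits from the estimate.
    prop_below = sum(b < estimate for b in boots) / n_boot
    prop_below = min(max(prop_below, 1 / n_boot), 1 - 1 / n_boot)
    z0 = nd.inv_cdf(prop_below)
    z = nd.inv_cdf(1 - (1 - level) / 2)
    lower = boots[min(n_boot - 1, int(nd.cdf(2 * z0 - z) * n_boot))]
    upper = boots[min(n_boot - 1, int(nd.cdf(2 * z0 + z) * n_boot))]
    return estimate, lower, upper
```

An interval that excludes zero is consistent with, but does not by itself establish, an indirect path from condition through the mediator to the outcome.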

4. Results 
4.1 Impact of the Fractions Intervention on Student Fractions Achievement 

Descriptive data. Unadjusted pretest and posttest means and standard 
deviations are listed in Table 5. Note that the sample of students selected to participate 
in the study was between the 15th and the 37th percentile on the TUF-4 measure based 
on a distribution that was computed across the entire tested sample (1,123 students) 
from the three districts. Thus, while the selected students were performing well below 
average in the area of fractions, they were performing at a higher level (between the 
40th and the 44th percentile) but still below average on WRAT-4, a nationally normed 
test of general mathematics achievement. 

The mean percent correct on TUF-5, an 18-item contemporary grade-level 
measure of fractions achievement aligned with Grade 5 CCSS standards, was 46.45 for 
students who participated in the intervention and 32.55 for students who received 
business as usual. The mean percent correct on the 24-item Test of Fraction 
Procedures assessing fifth-grade level fraction computation in addition, subtraction, 
multiplication, and division was 55.04 for intervention students and 32.09 for 
comparison students. 

Table 5 
Descriptive Data on Student Pre- and Posttests 

                              Intervention (n = 87)               Comparison (n = 99) 
                              Unadjusted Mean   Mean Percent      Unadjusted Mean   Mean Percent 
Measure                       Test Score^a (SD) (SD)              Test Score^a (SD) (SD) 

Pretest Measures 
TUF-4 (Screener)              11.49 (1.44)      44.21 (5.53)      11.45 (1.55)      44.06 (5.95) 
WRAT-4                        97.91^b (10.46)   n/a               96.18^c (11.21)   n/a 
Procedures (Add/Sub)          5.69 (4.03)       23.71 (16.80)     5.13 (4.12)       21.38 (17.18) 
NLE 0-1                       n/a               72.50 (10.54)     n/a               74.11 (12.14) 

Posttest Measures 
TUF-4                         16.84 (4.16)      64.77 (16.00)     13.46 (4.13)      51.79 (15.87) 
TUF-5                         8.36 (3.92)       46.45 (21.78)     5.86 (3.25)       32.55 (18.05) 
Test of Fraction Procedures   26.42 (10.79)     55.04 (22.48)     15.40 (9.46)      32.09 (19.72) 
Curriculum-aligned Measure    16.16 (4.66)      64.64 (18.64)     11.77 (3.55)      47.07 (14.19) 
NLE 0-1                       n/a               90.65 (7.53)      n/a               80.03 (11.67) 
NLE 0-2                       n/a               87.24 (8.33)      n/a               80.82 (8.15) 

Note. Total sample size N = 186 students (I = 87, C = 99). Sample size for TUF-5, Test of 
Fraction Procedures (Full), NLE 0-1, and NLE 0-2 posttests is 185 students (I = 86, C = 99). 
^a WRAT-4 scores are standard scores; the rest are raw test scores. ^b WRAT-4 standard score of 
97.91 = 44.45th percentile. ^c WRAT-4 standard score of 96.18 = 39.95th percentile. n/a = not 
available. 


Impact on student posttests. The results of the partially nested analyses that 
compared intervention students in small groups to unclustered comparison students at 
posttest are presented in Table 6. The Hedges’ g values range from .66 to 1.08, and 
p-values were all less than .0001, even after the Benjamini-Hochberg correction. 

The findings provide an estimate of the fractions intervention’s impact on 
students’ understanding of foundational fractions concepts and procedural 
competence with fractions. Two fraction measures, TUF-4 and TUF-5, were used to 
assess impacts on students’ proficiency with foundational and grade-level fractions 
content. The effect size (Hedges’ g) on the TUF-4 measure was .78 and was statistically 
significant (p < .0001). The impact was also statistically significant for TUF-5 (g = .66, 
p < .0001). 
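
To make the effect size concrete, the sketch below computes a standardized mean difference from summary statistics, with an optional small-sample correction of the form 1 - 3/(4N - 9). The function name is hypothetical, and the inputs are the rounded TUF-4 values printed in Tables 5 and 6 (condition coefficient 3.25; unadjusted SDs 4.16 and 4.13; n = 87 and 99), so the result only approximates the reported value.

```python
import math


def hedges_g(mean_diff, sd_1, n_1, sd_2, n_2, small_sample_correction=True):
    """Standardized mean difference using the pooled unadjusted SD."""
    pooled_var = ((n_1 - 1) * sd_1**2 + (n_2 - 1) * sd_2**2) / (n_1 + n_2 - 2)
    g = mean_diff / math.sqrt(pooled_var)
    if small_sample_correction:
        # Small-sample correction factor; close to 1 for samples this large.
        g *= 1 - 3 / (4 * (n_1 + n_2) - 9)
    return g


# TUF-4: adjusted condition difference and unadjusted posttest SDs.
print(round(hedges_g(3.25, 4.16, 87, 4.13, 99, small_sample_correction=False), 3))
print(round(hedges_g(3.25, 4.16, 87, 4.13, 99), 3))
```

From these rounded inputs the uncorrected ratio is about 0.784 and the corrected value about 0.781. Both round to the .78 reported in the text; whether and how the report applied the correction to unrounded estimates is not stated here, so this should be read as an approximation.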

The Test of Fraction Procedures was used to measure student proficiency with 
grade-level computation in the areas of addition, subtraction, multiplication, and division 
of fractions. Analyses revealed that students who received the TransMath fractions 
intervention significantly outperformed students from the comparison group (g = 1.07; p 
< .0001). The Curriculum-aligned Measure was used to assess the intervention’s impact 
on students’ understanding of the fraction content covered in the TransMath 
intervention. The impact was statistically significant (p < .0001), with an effect size (g) of 
1.06. 

The NLE 0-1 and NLE 0-2 measures were used to determine if there were 
additional impacts on students’ estimation of relevant fraction magnitude. Students in 
the intervention condition scored significantly better than those in the comparison group 
on the NLE 0-1 (g = 1.08; p < .0001) and NLE 0-2 (g = .80; p < .0001). 

In Table 6, the top set of rows presents the fixed effects, followed by the 
variances in the next set of rows, with details about the tests of condition in the third set 
of rows. The bottom two rows of the table show the likelihood ratio test results that 
compared homoscedastic residuals with heteroscedastic residuals, and the table 
reports a different number of variances depending on the results. The data fit the 
homoscedastic model that assumed an equivalent residual variance across conditions 
for the Curriculum-aligned Measure, TUF-4, TUF-5, and Procedures measures. The 
data fit the heteroscedastic model (p < .20) for the Procedures (Add/Sub), NLE 0-1, and 
NLE 0-2 measures. Although the variance structures differed between these models, 
the condition effect estimates and their statistical significance values were very similar 
for the heteroscedastic and homoscedastic models. All analytic models included the 
WRAT-4 and NLE 0-1 pretest measures as covariates, which were statistically 
significant in each model. 


See Table 7 for adjusted posttest means and unadjusted standard deviations. 

Table 6 
Results of a Partially Nested Mixed-Model Analysis of Covariance on Students’ Posttests 

                                TUF-4               TUF-5               Test of Fraction    Curriculum-aligned  NLE 0-1             NLE 0-2 
                                                                        Procedures          Measure 

Fixed Effects 
  Intercept                     1.46 (2.48)         -5.50* (2.34)       -14.92* (6.15)      .94 (2.74)          63.77**** (6.22)    67.48**** (4.98) 
  Condition (Intervention)      3.25**** (.56)      2.36**** (.46)      10.79**** (1.37)    4.33**** (.51)      10.79**** (1.20)    6.56**** (.95) 
  WRAT-4 Pretest                .16**** (.02)       .14**** (.02)       .40**** (.06)       [illegible] (.02)   [illegible] (.06)   .23**** (.05) 
  NLE 0-1 Pretest               .14**** (.02)       .09**** (.02)       .31**** (.05)       .11**** (.02)       .37**** (.05)       -.85**** (.04) 

Variances 
  Intervention Group Intercept  2.87* (1.31)        1.00 (.77)          16.95* (7.13)       .49 (.95)           4.88 (4.91)         .715 (3.77) 
  Residual                      6.44**** (1.11)     6.90**** (.96)      39.33**** (6.33)    10.50**** (1.41)    --                  -- 
  Intervention Residual         --                  --                  --                  --                  38.21**** (6.68)    44.57**** (7.69) 
  Comparison Residual           --                  --                  --                  --                  68.03**** (12.03)   31.24**** (5.94) 

ICC (Intervention Groups)       .31                 .13                 .30                 .04                 .11                 .02 
Hedges’ g (Condition)           0.784               0.660               1.068               1.055               1.083               0.796 
p-values (Condition)            <.0001              <.0001              <.0001              <.0001              <.0001              <.0001 
BH p-values (Condition)         <.0001              <.0001              <.0001              <.0001              <.0001              <.0001 
df (Condition)                  47                  61                  55                  62                  66                  49 
Likelihood ratio χ²             0.34                0.30                0.01                0.90                5.08                2.74 
  p-values                      .558                .585                .919                .342                .024                .098 

Note. Total sample size N = 186 students (I = 87, C = 99). Sample size for TUF-5, Test of 
Fraction Procedures, NLE 0-1, and NLE 0-2 is 185 students (I = 86, C = 99). Fixed effects and 
variances shown as parameter estimates with standard errors in parentheses. The models 
nested only intervention students within groups; comparison students were unclustered. ICCs 
estimated only for intervention students nested within instructional groups. The degrees of 
freedom (df) for tests of condition effects were based on the Satterthwaite approximation. 
Likelihood ratio tests, at bottom, compared homoscedastic residuals to heteroscedastic 
residuals with a criterion α of .20 and one degree of freedom. 

****Significant at p = .0001; ***significant at p = .001; **significant at p = .01; *significant at p = 
.05. 
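
The intraclass correlations (ICCs) for the intervention groups follow from the variance components in Table 6. The sketch below assumes the usual definition (group-level variance divided by the sum of group-level and residual variance, using the intervention residual where residuals differ by condition); the function name is hypothetical.

```python
def intraclass_correlation(group_variance, residual_variance):
    """Share of total variance attributable to tutoring groups."""
    return group_variance / (group_variance + residual_variance)


# Variance components as printed in Table 6 (group intercept, residual).
print(round(intraclass_correlation(1.00, 6.90), 2))    # TUF-5
print(round(intraclass_correlation(16.95, 39.33), 2))  # Test of Fraction Procedures
print(round(intraclass_correlation(4.88, 38.21), 2))   # NLE 0-1, intervention residual
```

These reproduce the tabled ICCs of .13, .30, and .11 for the three measures shown.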




Table 7 
Student Posttest Adjusted Means 

                              Intervention (n = 87)               Comparison (n = 99) 
                              Unadjusted  Adjusted    Unadjusted  Unadjusted  Adjusted    Unadjusted 
Posttest Measures             Mean        Mean        SD          Mean        Mean        SD 

TUF-4                         16.84       16.74       4.16        13.46       13.49       4.13 
TUF-5                         8.36        8.28        3.92        5.86        5.92        3.25 
Test of Fraction Procedures   26.42       26.33       10.79       15.40       15.54       9.46 
Curriculum-aligned Measure    16.16       16.13       4.66        11.77       11.80       3.55 
NLE 0-1                       90.65       90.76       7.53        80.03       79.97       11.67 
NLE 0-2                       87.24       87.31       8.33        80.82       80.76       8.15 

Note. Total sample size N = 186 students (I = 87, C = 99). Sample size for TUF-5, Test of 
Fraction Procedures (Full), NLE 0-1, and NLE 0-2 posttests is 185 students (I = 86, C = 99). 


Sensitivity analyses. We used a partially nested model with intervention 
students nested within tutoring groups and unclustered comparison students as the 
main model for impact analysis. This test of condition did not involve any further nesting 
levels (blocking), such as teacher or school. As a sensitivity analysis, we ran two other 
models: (a) students nested within tutoring groups and unclustered comparison 
students blocked by teacher, and (b) students nested within tutoring groups and 
unclustered comparison students blocked by school. We also tested for students nested 
within tutors (instead of tutoring groups) and unclustered comparison students (no 
blocks). In addition, as the TUF-4 was given as both a pre- and posttest, we also 
examined the impacts from a gain score analysis for just the TUF-4 outcome. Finally, 
we tested the model without covariates. Together, these tests show how sensitive the 
condition effects are to methodological variations. As seen in Table 8, all the 
analyses resulted in similar impacts. We found no meaningful differences in the 
results due to the differences in the analytical approach. All p-values and effect sizes 
were similar across models. Moreover, the models that blocked on teachers and 
schools did not suggest any treatment-effect variation across the higher levels. 


Table 8 
Sensitivity of Main Effects to Analytic Approaches with Different Assumptions 

Analysis Approach^a           Analysis Unit   Block     Coefficient (SE)  p-value   Hedges’ g 

TUF-4 
ANCOVA                        Small group^b   -         3.25 (0.56)       < .0001   0.784 
ANCOVA                        Small group     Teacher   3.47 (0.51)       < .0001   0.84 
ANCOVA                        Small group     School    3.26 (0.59)       < .0001   0.79 
ANCOVA                        Tutor           -         3.45 (0.61)       < .0001   0.83 
Gain Score^c                  Small group     -         3.25 (0.73)       < .0001   0.79 
ANOVA                         Small group     -         3.29 (0.72)       < .0001   0.79 

TUF-5 
ANCOVA                        Small group^b   -         2.36 (0.46)       < .0001   0.660 
ANCOVA                        Small group     Teacher   2.36 (0.48)       < .0001   0.66 
ANCOVA                        Small group     School    2.29 (0.44)       .0001     0.64 
ANCOVA                        Tutor           -         2.29 (0.53)       .0002     0.64 
ANOVA                         Small group     -         2.48 (0.64)       .0004     0.69 

Test of Fraction Procedures 
ANCOVA                        Small group^b   -         10.79 (1.37)      < .0001   1.068 
ANCOVA                        Small group     Teacher   10.89 (1.15)      < .0001   1.08 
ANCOVA                        Small group     School    10.99 (1.33)      < .0001   1.09 
ANCOVA                        Tutor           -         10.58 (1.69)      < .0001   1.05 
ANOVA                         Small group     -         11.02 (1.79)      < .0001   1.09 

Curriculum-aligned Measure 
ANCOVA                        Small group^b   -         4.33 (0.51)       < .0001   1.055 
ANCOVA                        Small group     Teacher   4.45 (0.48)       < .0001   1.08 
ANCOVA                        Small group     School    4.39 (0.55)       < .0001   1.07 
ANCOVA                        Tutor           -         4.34 (0.59)       < .0001   1.06 
ANOVA                         Small group     -         4.37 (0.70)       < .0001   1.07 

NLE 0-1 
ANCOVA                        Small group^b   -         10.79 (1.20)      < .0001   1.083 
ANCOVA                        Small group     Teacher   10.59 (1.27)      < .0001   1.06 
ANCOVA                        Small group     School    10.70 (1.25)      < .0001   1.07 
ANCOVA                        Tutor           -         10.82 (1.17)      < .0001   1.09 
Gain Score^c                  Small group     -         12.51 (1.66)      < .0001   1.26 
ANOVA                         Small group     -         10.64 (1.39)      < .0001   1.07 

NLE 0-2 
ANCOVA                        Small group^b   -         6.56 (0.95)       < .0001   0.796 
ANCOVA                        Small group     Teacher   6.57 (0.93)       < .0001   0.80 
ANCOVA                        Small group     School    6.56 (0.98)       < .0001   0.80 
ANCOVA with tutor as cluster  Tutor           -         6.62 (0.75)       < .0001   0.80 
ANOVA                         Small group     -         6.38 (1.25)       < .0001   0.78 

^a All ANCOVA models included the WRAT-4 and NLE 0-1 variables as covariates. ^b Main model 
used in the impact analysis. ^c Time × Condition analysis. 

Impact on performance assessments. Five performance-based assessments 
were administered throughout the course of the intervention to all participating 
intervention students and to a randomly selected sub-sample of 34 comparison 
students. To compare the relative performance of both student groups, we randomly 
selected 35 intervention students and compared their performance to that of the 34 
comparison students. As seen in Table 9, intervention students significantly 
outperformed comparison students on all five performance assessments (g = .68 to 
1.23), and p-values were all statistically significant after the Benjamini and Hochberg 
(1995) correction for multiple comparisons.3 

3Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: A practical and 
powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B 
(Methodological), 57(1), 289-300. 
Table 9

Impact on Five Performance Assessments

                                             Intervention     Comparison
                                               (n = 35)        (n = 34)
                                  Possible   Mean raw         Mean raw
Performance Assessment            points     score     SD     score     SD    t test (df)   p-value   Hedges' g
1. Estimating a Fraction on a Number Line
   Composite Score                   5       2.82     1.47    1.68     1.41   3.29 (66)     .002      .789**
   Item 1: Correct Answer            2       1.62     0.74    1.21     0.91
   Item 2: Explanation               3       1.21     0.98    0.47     0.79
   Number of Math Vocabulary        --       1.21     1.15    0.35     0.49
     Words used
2. Ordering Fractions on a Number Line
   Composite Score                   6       1.60     1.77    0.53     1.28   2.87 (67)     .006      .683**
   Item 1: Correct Answer            2       0.60     0.85    0.24     0.61
   Item 2: Explanation               4       1.00     1.21    0.29     0.80
   Number of Math Vocabulary        --       2.37     1.44    0.62     0.74
     Words used
3. Fraction Addition Word Problem
   Composite Score                   9       3.60     1.58    2.03     1.57   4.15 (67)     .000      .988***
   Item 1: Correct Answer            3       0.74     0.56    0.26     0.71
   Item 2: Explanation               5       1.89     1.13    0.85     0.93
   Item 3: Correct Operation         1       0.97     0.17    0.91     0.29
   Number of Math Vocabulary        --       1.86     1.09    0.47     0.66
     Words used
4. Fraction Multiplication Word Problem
   Composite Score(a)                6       2.20     2.08    0.24     0.78   5.21 (44.2)   .000      1.228***
   Item 1: Correct Answer            1       0.57     0.50    0.06     0.24
   Item 2: Explanation               4       1.09     1.17    0.09     0.29
   Item 3: Correct Operation         1       0.54     0.51    0.09     0.29
   Number of Math Vocabulary        --       2.17     1.15    0.59     0.78
     Words used
5. Fraction Subtraction Word Problem
   Composite Score(a)                8       3.24     2.20    1.38     1.30   4.22 (54.8)   .000      1.012***
   Item 1: Correct Answer            2       0.56     0.82    0.09     0.38
   Item 2: Explanation               5       1.97     1.34    0.71     0.84
   Item 3: Correct Operation         1       0.71     0.46    0.59     0.50
   Number of Math Vocabulary        --       2.50     1.46    0.97     1.22
     Words used

Note. Total analytic sample = 69 students (I = 35, C = 34). Sample size for Performance
Assessments 1 and 5 is 68 students (I = 34, C = 34). (a) Unequal variances; used two-sample
t-tests with Welch's approximation and corresponding Hedges' g. ***Significant at p = .001;
**significant at p = .01.
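
As a rough check on how the test statistics and effect sizes in Table 9 relate to the reported means and standard deviations, the sketch below computes a pooled-variance t statistic and Hedges' g with the common small-sample correction 1 - 3/(4df - 1). The function name is hypothetical, and the report does not state the exact formulas used, so this is an assumption rather than the research team's method. It applies only to the equal-variance rows, not to the Welch-adjusted rows marked (a).

```python
import math


def pooled_t_and_hedges_g(m1, s1, n1, m2, s2, n2):
    """Equal-variance two-sample t statistic and bias-corrected Hedges' g."""
    df = n1 + n2 - 2
    # Pooled standard deviation across the two groups.
    pooled_sd = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df)
    d = (m1 - m2) / pooled_sd                 # Cohen's d
    t = d / math.sqrt(1 / n1 + 1 / n2)        # t statistic
    correction = 1 - 3 / (4 * df - 1)         # small-sample correction
    return t, df, d * correction


# Performance Assessment 2 (Ordering Fractions on a Number Line), Table 9.
t, df, g = pooled_t_and_hedges_g(1.60, 1.77, 35, 0.53, 1.28, 34)
print(f"t({df}) = {t:.2f}, g = {g:.3f}")
```

With the rounded inputs from Table 9, the result for Assessment 2 comes out close to the reported t(67) = 2.87 and g = .683; small discrepancies on other rows would be expected from rounding of the published means and standard deviations.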
Quality of explanations. Across all five performances, the explanations given by
intervention students were more thorough than the explanations provided by 
comparison students. Intervention students’ written explanations included more math 
vocabulary words, more coherent rationales for their decisions and solutions methods, 
and details that demonstrated their understanding of the procedures they employed. For 
example, intervention students used an average of 2.02 relevant math vocabulary 
words, such as common denominator, simplify, and unit fraction, compared to an 
average of 0.6 words used by comparison students. The following is typical of the 
explanations that were provided by comparison students for the fifth performance 
assessment (Manny is training for a marathon. He ran 102 miles on Saturday. On 
Sunday, he ran 7=. miles. How many more miles did Manny run on Saturday than 
Sunday?): “I just multiplyed the two number[s] together. Then I made sure that I did it
right. And that is how I got my answer.” In contrast, the explanations provided by
intervention students were more detailed, organized, and comprehensive (the 
underlined words are mathematical vocabulary words emphasized in the TransMath 
curriculum): 

For my answer I got 2= then I simplified it to 2 so my answer was a mixed
number and 2-. The first thing I did was look for a clue word and that was how
many more, so I knew I had to subtract. Next, I lined up my fractions and whole
numbers correctly. Then, I checked if they had a common denominator, which
they didn’t. Then I noticed 10 was a multiple of 5 and kept 7= then changed 102
to 10=. Finally, I subtracted and saw that you couldn't subtract = minus =, so I
regrouped. I regrouped by taking a whole away from ten and added = + = to get
= and change 10 into 9. Then I got the answer of 2= which is equivalent to 25.
That’s how I got my answer.

See Appendix A for additional examples of student written explanations. 
4.2 CLASS Observations of Tutoring Groups 

To better understand the facets of the learning environment that led to enhanced 
outcomes for this at-risk group of students, observations of the tutoring groups were 
conducted using the CLASS observational system (Pianta et al., 2012a). Instruction 
during the 35-minute TransMath fractions intervention lesson was rated on a 7-point 
Likert scale, where a rating of 1 or 2 is low range; 3, 4, 5 encompasses the middle 
range; and 6, 7 indicates a high performance level. Mean rating scores and the range of 
scores across the 21 tutoring groups are presented in Table 10. Mean ratings were in
the high range only for Behavior Management and Productivity and in the high-middle
range for Teacher Sensitivity, Instructional Learning Formats, and Student Engagement.

Interpreting ratings of tutoring groups. Table 10 also includes CLASS ratings 
from the Measures of Effective Teaching Study (MET), which included a nationwide 
sample of 1,333 teachers from grades 4-6; these ratings were obtained from the Upper 
Elementary and Secondary CLASS Technical Manual study (Pianta, Hamre, & Mintz, 
2012b). Note that the ratings from the MET study are not nationally representative but 
provide an example of scores from a large nationwide sample (i.e., a reference point), to 
facilitate comparison and an understanding of the ratings from this study. Also note that 
while the MET study observed videos of general education teachers frequently teaching 
whole classes of students, in the current study, tutors were observed working with small 
groups of approximately five students. 


Tutors received a higher rating than the MET teachers for Student Engagement
and the domain of Classroom Organization. They also scored higher in the Emotional
Support domain for Positive Climate and Teacher Sensitivity; in the Classroom 
Organization domain for Behavior Management, Productivity, and Negative Climate; 
and in the Instructional Support domain for Instructional Learning Formats, Content 
Understanding, and Quality of Feedback. 

Tutors scored lower than the MET teachers on Regard for Student Perspectives, 
Analysis and Inquiry, and Instructional Dialogue, which is not surprising, given the 
context and goals of the small-group tutoring setting. Time during the 35-minute tutoring 
session was spent on four structured activities: review, explicit instruction, guided 
practice, and student explanations. Lessons progressed at a rapid pace, with tutors 
explicitly teaching the fractions content. Time was allocated for eliciting student 
explanations, but in general, the sessions were not designed for in-depth, rich content- 
focused discussions and back and forth extended exchanges between tutors and 
students (i.e., Instructional Dialogues). There were brief but meaningful structured peer 
interactions, but the sessions were also not planned to provide opportunities for student 
autonomy and leadership (i.e., Student Perspectives). Also, while the fractions 
intervention did emphasize problem solving, it did not provide for extensive student 
explorations of novel or open-ended problems, tasks, or questions, or for other types of 
higher-order thinking activities (i.e., Analysis and Inquiry). These types of activities 
typically occur during core mathematics instruction, when they do occur. Students in
both conditions did receive core mathematics instruction.
Table 10
CLASS Observation Ratings

                                     CLASS Rating:              CLASS Rating:
                                     current study              MET study*
CLASS Item                           Mean   SD     Range        Mean   SD     Range
Emotional Support                    4.38   0.97   2.0-6.3      n/a    n/a    n/a
  Positive Climate                   4.66   1.38   2-7          4.68   0.61   2.38-6.50
  Teacher Sensitivity                5.72   1.16   2-7          4.26   0.55   2.33-6.50
  Regard for Student Perspectives    2.75   0.92   2-5          3.29   0.60   1.38-5.18
Classroom Organization               6.77   0.38   5.7-7.0      n/a    n/a    n/a
  Behavior Management                6.71   0.57   5-7          6.01   0.58   2.88-7.00
  Productivity                       6.72   0.56   5-7          5.91   0.46   3.88-7.00
  Negative Climate                   1.11   0.32   1-2          1.32   0.35   1.00-3.38
Instructional Support                4.00   0.72   2.2-5.2      n/a    n/a    n/a
  Instructional Learning Formats     5.21   0.85   3-6          4.36   0.52   2.50-6.50
  Content Understanding              4.72   0.86   3-6          3.97   0.53   2.08-6.50
  Analysis and Inquiry               2.68   0.72   2-4          2.80   0.50   1.63-4.31
  Quality of Feedback                4.21   1.07   2-6          3.76   0.57   2.00-5.81
  Instructional Dialogue             3.17   0.78   1-4          3.51   0.56   2.00-5.50
Student Engagement                   5.74   1.18   3-7          5.08   0.48   2.94-6.50

Note. Each of the 12 CLASS dimensions is rated on a 7-point Likert scale, with 1-2 indicating
a low score, 3-5 being moderate, and 6-7 indicating a high score. Scoring for Negative Climate
(Classroom Organization) is reversed; when calculating the domain total, it is necessary to
subtract the average score for Negative Climate from 8.

*Data obtained from the Upper Elementary and Secondary CLASS Technical Manual (Pianta
et al., 2012b). The Measures of Effective Teaching Study (MET) was conducted nationwide with
a sample of 1,333 teachers in Grades 4-6. Each teacher was assessed using CLASS
four to eight times throughout one academic year. n/a = not available.
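
The reverse-scoring rule in the note to Table 10 can be illustrated with a short sketch. It averages the dimension means from the current study within each domain, replacing the Negative Climate mean with 8 minus its value. The function name is hypothetical, and treating a domain score as the simple mean of its dimension means is an assumption; the report's domain means may have been computed per observation before averaging.

```python
def domain_score(dimension_means, reversed_dimensions=()):
    """Average the dimension means, reverse-scoring the named dimensions (8 - x)."""
    adjusted = [
        8 - value if name in reversed_dimensions else value
        for name, value in dimension_means.items()
    ]
    return sum(adjusted) / len(adjusted)


# Current-study dimension means from Table 10.
classroom_organization = {
    "Behavior Management": 6.71,
    "Productivity": 6.72,
    "Negative Climate": 1.11,
}
emotional_support = {
    "Positive Climate": 4.66,
    "Teacher Sensitivity": 5.72,
    "Regard for Student Perspectives": 2.75,
}

org = domain_score(classroom_organization, reversed_dimensions={"Negative Climate"})
emo = domain_score(emotional_support)
print(round(org, 2), round(emo, 2))
```

Under this assumption the sketch reproduces the domain means shown in Table 10 for Classroom Organization (6.77) and Emotional Support (4.38).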


Correlations between CLASS ratings and student fractions outcomes.
Correlations between CLASS ratings and student fractions outcomes for students in the
intervention group are presented in Table 11. Three dimensions (Quality of Feedback,
Instructional Dialogue, and Content Understanding) were significantly correlated with
all measures of fraction achievement: TUF-4, TUF-5, Test of Fraction Procedures, and
the Curriculum-aligned Measure. Interestingly, none of the CLASS dimensions were
correlated with the NLE measure.
Table 11
CLASS Correlations with Student Fractions Outcomes

                                                       Test of
                                                       Fraction     Curriculum-
                                    TUF-4    TUF-5     Procedures   aligned       NLE 0-1   NLE 0-2
CLASS Item                          Post     Post      Post         Post          Post      Post
Emotional Support                   -.035    .050      .043         .048          -.017     -.117
  Positive Climate                  -.040    .058      .029         .056          .047      -.087
  Teacher Sensitivity               -.116    .005      -.050        -.024         -.068     -.161
  Regard for Student Perspectives   .096     .063      .158         .099          -.040     -.036
Classroom Organization              .116     .211      .104         .152          -.008     -.015
  Behavior Management               .138     .210      .098         .079          .041      .022
  Productivity                      .040     .131      .085         .176          -.068     .002
  Negative Climate                  -.091    -.136     -.041        -.083         .017      .093
Instructional Support               .194     .232*     .233*        .240*         .026      -.020
  Instructional Learning Formats    -.023    .102      .042         .115          -.072     -.106
  Content Understanding             .147     .201      .237*        .276**        .025      .041
  Analysis and Inquiry              .187     .124      .115         .116          .120      .057
  Quality of Feedback               .253*    .265**    .260*        .217*         .047      .056
  Instructional Dialogue            .238*    .254*     .299**       .274**        -.006     .037
Student Engagement                  -.018    .071      .001         .065          -.060     -.128

Note. Total sample size N = 87 intervention students. Sample size for TUF-5, Test of Fraction
Procedures, NLE 0-1, and NLE 0-2 is 86 intervention students. Each of the 12 CLASS
dimensions is rated on a 7-point Likert scale, with 1-2 indicating a low score, 3-5 being
moderate, and 6-7 indicating a high score. Scoring for Negative Climate (Classroom
Organization) is reversed. Computed correlation used Pearson method with pairwise deletion.
*Significant at p = .05; **significant at p = .01.

Quality of Feedback addresses the extent to which tutor feedback increases and 
extends students’ understanding and learning and encourages students to participate. 
Ratings are based on behavior markers that reflect the extent to which tutor feedback 
enhances learning. The behavior markers include feedback loops (e.g., back-and-forth 
exchanges between tutor and student), scaffolding (e.g., hints and prompting), building 
on student responses (e.g., specific feedback, clarification), and encouragement and 
affirmation (e.g., recognizing and affirming student effort). 


Instructional Dialogue reflects the practice of purposeful discussions between
tutor and students. The primary purpose of this dimension is to determine if connections
between and among ideas are made to enhance students’ understanding of the content
at hand. Behavior markers include cumulative content-driven exchanges (e.g., depth of 
the exchanges, exchanges that build on one another rather than switch topics), 
distributed talk (e.g., a balance between tutor and student talk, majority of students have 
the opportunity to contribute), and facilitation strategies (e.g., use of open-ended 
questions and statements; acknowledging, repeating, and extending student 
responses). 

The middle to low-middle mean ratings for these two dimensions of instruction 
(Quality of Feedback = 4.21; Instructional Dialogue = 3.17) indicate that a modest 
amount of feedback and dialogue occurred during the intervention lessons. Most 
importantly, these levels of elegant feedback and instructional dialogue were 
significantly correlated with several of the student mathematics outcomes (with 
correlations ranging from .217 to .299; p < .05).
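
For readers who want to see how correlations of this kind are obtained, the sketch below computes a Pearson correlation with pairwise deletion (the method named in the note to Table 11) and the t statistic that corresponds to a given r and sample size. The function names and the small data set are hypothetical and chosen only for illustration; they are not study data.

```python
import math


def pearson_pairwise(x, y):
    """Pearson r using only the pairs where both values are present."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    return sxy / math.sqrt(sxx * syy), n


def t_for_r(r, n):
    """t statistic (df = n - 2) for testing whether a correlation differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Hypothetical ratings and scores; None marks a missing value.
class_rating = [4, 5, 3, 6, None, 4]
posttest = [10, 14, 9, None, 12, 11]
r, n_used = pearson_pairwise(class_rating, posttest)

# Largest correlation in Table 11: r = .299 with 86 students.
t_largest = t_for_r(0.299, 86)
print(round(r, 3), n_used, round(t_largest, 2))
```

Pairwise deletion drops a student only from the correlations for which that student is missing a score, which is why the sample sizes in Table 11 differ slightly across outcomes.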

These statistically significant associations were, in all likelihood, stimulated by
the instructional design of the fractions intervention lessons. Specifically, throughout the 
lessons students are asked to explain the reasoning behind their answers, which lends 
itself to the type of feedback, prompting, and instructional dialogue measured by CLASS 
and recommended by contemporary state standards and the CCSS Practice Standards. 

For example, after explaining and modeling ways to add fractions with like
denominators, the tutor directed students to write an addition equation on their
whiteboard. Next, students were asked if the problem was solved correctly. Together, the
students and tutor discussed how the equation could be “fixed.” After students worked in
pairs to solve the problem, another discussion took place after students were asked, “Why is
that correct and the first way incorrect?” Students discussed their responses using
either Cuisenaire Rods or a number line to illustrate their rationale. Clearly, the
instructional dialogue and feedback generated by this type of interactive instruction was 
beneficial to the students and was significantly positively correlated with student 
performance on the fractions outcome measures. 

The average rating of 4.72 on the Content Understanding dimension borders on the upper
end of the middle range. Content Understanding was significantly correlated with one
fractions posttest: the Test of Fraction Procedures (r = .237; p < .05). The Content
Understanding dimension captures the complexity of the lesson’s content and the
instructional methods used to help students understand key attributes of the concepts. It 
focuses on building student understanding of the relationships among facts, skills, and 
concepts; establishing real-world connections; identifying essential components and 
communicating them through multiple examples and contrasting examples; linking to 
prior knowledge; and attending to misconceptions. 

For example, when students were initially taught to add fractions with unlike 
denominators, the concept was related to a previously learned skill, solving problems 
with like denominators. Thus, tutors addressed behavior markers such as tapping prior 
knowledge, providing contrasting examples, multiple and varied examples, and 
cumulative review, all of which assist in decreasing student misconceptions. The tutors 
provided explicit instruction in using the algorithm for solving addition problems 
containing fractions with unlike denominators. The instruction also included an 
explanation of why the algorithm works using visual representations. 

Classroom organization and student engagement. Observational research in
general has shown both Classroom Organization and Student Engagement to be
correlated with student outcomes during whole-class instruction (Foorman et al., 2006;
Gersten et al., 2009; Jayanthi et al., 2017). However, in this study, despite their high
ratings, Classroom Organization and Student Engagement were not significantly 
correlated with student fraction outcomes. 

Classroom Organization received an exceptionally high mean rating of 6.77. 
During the 35-minute session, tutors managed time and instructional routines so that 
time allocated to instruction was maximized. It also indicates that the tutor set up clear 
expectations and managed student behavior effectively, and that there were very few 
incidences of unwanted negative behavior. Of course, this rating is easier to achieve in 
a small group of five or so than in a full class. Nonetheless, this finding suggests that 
the interventionists provided an organized learning environment for their students. 

The rating for Student Engagement was in a high-middle range (5.74), indicating 
that the majority of the students were engaged or on-task. However, there was 
variability in this dimension. Observation data indicated that some students were 
actively attending (e.g., answering questions, sharing ideas) and participating in the 
instructional activity, some were engaging passively (e.g., watching/listening to the 
tutor), and others were disengaged for part of the time. Although one would assume that 
student engagement would be easier to achieve in a small group of five than in a class 
of 30, that has not always been the case (e.g., Gersten, Carnine & Williams, 1982). 

4.3 Exploratory Analyses 

Moderator analysis. To determine if there were any differential impacts, student
pretest scores (NLE 0-1, TUF-4, WRAT-4, Test of Fraction Procedures [Add/Sub]),
demographic variables (gender, free and reduced lunch), and districts (Districts 1, 2,
and 3) were examined as moderators of the relation between the fractions intervention
and student achievement at posttest. There were no statistically significant tests of
moderation for gender.

We found nine statistically significant moderators. As the moderator analyses are 
correlational and exploratory in nature, and not causal and confirmatory, we did not 
correct for multiple tests. 
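
A moderator test of this kind is commonly set up as a regression of the posttest on condition, the moderator, and their product, with the coefficient on the product term carrying the test of moderation. The sketch below fits such a model by ordinary least squares on a small, noise-free, hypothetical data set. The function names, the data, and the specific model form are assumptions for illustration; the report does not state the exact model specification used.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_moderation(treat, pretest, posttest):
    """OLS for posttest = b0 + b1*treat + b2*pretest + b3*(treat*pretest)."""
    rows = [[1.0, t, p, t * p] for t, p in zip(treat, pretest)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, posttest)) for i in range(k)]
    return solve_linear(xtx, xty)


# Hypothetical data generated from a known model (interaction = 1.5).
treat = [0, 0, 0, 0, 1, 1, 1, 1]
pretest = [10, 12, 14, 16, 10, 12, 14, 16]
posttest = [1 + 2 * t + 0.5 * p + 1.5 * t * p for t, p in zip(treat, pretest)]

b0, b_treat, b_pre, b_interaction = fit_moderation(treat, pretest, posttest)
print(round(b_interaction, 3))
```

A positive interaction coefficient in this setup means the treatment-comparison difference grows with the pretest score, which is the pattern the report describes for WRAT-4 and the Curriculum-aligned Measure.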

Student pretest moderators. Treatment students outperformed comparison 
students on all outcome measures. Furthermore, student scores on the WRAT-4 pretest 
significantly moderated intervention effects on the Curriculum-aligned Measure scores 
(p = .0075), TUF-5 posttest scores (p = .0264), and the NLE 0-1 posttest scores (p =
.0132). Results indicated that higher WRAT-4 pretest scores were associated with a
larger difference between treatment and comparison students’ scores on the 
Curriculum-aligned Measure and TUF-5 posttest. In contrast, lower WRAT-4 pretest 
scores were associated with a larger difference between treatment and comparison 
students’ scores on the NLE 0-1 posttest. 

Student scores on the NLE 0-1 pretest moderated the intervention effects on the
Curriculum-aligned Measure scores (p = .0087) and the NLE 0-1 posttest scores (p =
.0204). Higher performance on the NLE 0-1 pretest measure was associated with a
larger difference between treatment and comparison students’ scores on the
Curriculum-aligned Measure, while lower performance on the NLE 0-1 pretest measure
was associated with a larger difference between treatment and comparison students’ 
scores on the NLE 0-1 posttest. 

Demographic moderators. Free and reduced-price lunch status also
significantly moderated intervention effects on the TUF-4 (p = .0374) and NLE 0-1 (p =
.0400) posttests. Although treatment students outperformed comparison students on
both the TUF-4 posttest and the NLE 0-1 posttest regardless of free and reduced-price
lunch status, the difference between treatment and comparison students’ scores on the
TUF-4 posttest was more pronounced among students who were not receiving free and
reduced-price lunch, compared with students who were receiving free and reduced-price
lunch. The opposite was true for NLE 0-1 posttest performance: there was a larger
difference between treatment and comparison students’ scores on the NLE 0-1 posttest
among students who were receiving free and reduced-price lunch, compared with
students who were not receiving free and reduced-price lunch.

School district moderators. Students in District 2 performed lower than 
students from other districts on all the pretest measures, including the TUF-4 screening 
measure (p = .0048, .0000, .0774, and .0032 for the TUF-4, WRAT-4, Test of Fraction 
Procedures [Add/Sub], and NLE 0-1 pretests, respectively). Meanwhile, students in 
District 3 performed higher than students from Districts 1 and 2 on all pretest measures 
(p = .0000, .0012, .0463, .0000 for the TUF-4, WRAT-4, Test of Fraction Procedures
[Add/Sub], and NLE 0-1 pretests, respectively). Although treatment students from 
District 2 began the study at a lower achievement level than the other districts, their 
level of understanding grew at a faster rate than predicted over the course of the study. 
School district membership significantly moderated the size of the effect of the 
intervention on NLE 0-1 posttest scores. The difference between treatment and
comparison students’ NLE 0-1 posttest scores was largest among students from District
2 (District 2 vs. 1 and 3; p = .0181). In contrast, the gap between treatment and
comparison students’ NLE 0-1 posttest scores was smallest among students from
District 3 (District 3 vs. 1 and 2; p = .0213). This finding suggests that treatment
students from the lowest-performing district seemed to benefit the most from the
intervention. This finding could be related to the degree to which number lines were
used during core mathematics instruction. Teachers from District 3 reported that core
classroom instruction incorporated the use of number lines to teach fraction content 
(i.e., magnitude, equivalence, comparing, and the four operations) more often than did 
teachers from District 2. Of the seven teachers from District 3, 57% reported using 
number lines to teach fraction magnitude, 100% reported using number lines to teach 
fraction comparing, and 71% reported using number lines to teach fraction addition and 
subtraction, compared to 45%, 73%, and 55% of the 11 teachers from District 2. 
Because students from District 2 had limited exposure to using number lines to 
understand fractions concepts during core mathematics instruction, the use of these 
materials during the intervention could have acted as a springboard, boosting these 
treatment students to much higher achievement levels than predicted. In contrast, 
students from District 3 were familiar with using number lines outside of the intervention, 
so the difference between treatment and comparison students’ scores was less 
pronounced. 

Mediator analysis. We tested whether the effects of condition on gains on TUF-4
were potentially mediated by gains on NLE 0-1. Figure 1 depicts an indirect-path
model, the results of which offer evidence that supports mediation. The direct effect of
condition on gains in TUF-4 was statistically significant, 3.34 [2.13, 4.55], an estimate
very similar to the results in Tables 6, 7, and 8. With the indirect path a-b in the model,
the effect of condition on gains in TUF-4 dropped to 0.84 and was no longer statistically
significant (confidence bounds include zero). Thus, the results support the fully
mediated effect of intervention versus comparison on students’ fraction understanding
(i.e., TUF-4 gains) through improvement on the number line measure. Mediation models
are, however, correlational and cannot support a directional or causal interpretation.
Figure 1
Mediation Path Model

[Path diagram: Assignment to Intervention -> Number Line Improvement -> Fraction
Understanding Improvement. Direct path from Assignment to Intervention to Fraction
Understanding Improvement: c’ = 0.84 [-3.16, 2.74].]


This mediation model demonstrates that after accounting for the indirect effects 
of condition on gains in fraction understanding through gains on the number line 
measure, the direct effects of condition on gains in fractions understanding become 
statistically nonsignificant. The model, therefore, supports the hypothesis of complete 
mediation by number line improvement. The paths provide the raw estimates and 95% 
confidence intervals. 
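
The logic of the path model can be sketched with ordinary least squares: path a is the effect of condition on the mediator, path b is the effect of the mediator on the outcome controlling for condition, c' is the remaining direct effect, and the total effect equals c' plus the product a × b. The function names and data below are hypothetical, and the sketch omits the confidence intervals that the report's model provides; it is not the research team's analysis.

```python
def cross_dev(u, v):
    """Sum of cross-products of deviations from the means."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


def mediation_paths(x, m, y):
    """Return OLS estimates of a, b, c' (direct), and c (total)."""
    sxx, smm = cross_dev(x, x), cross_dev(m, m)
    sxm, sxy, smy = cross_dev(x, m), cross_dev(x, y), cross_dev(m, y)
    a = sxm / sxx                              # condition -> mediator
    c_total = sxy / sxx                        # condition -> outcome, no mediator
    det = sxx * smm - sxm ** 2
    b = (sxx * smy - sxm * sxy) / det          # mediator -> outcome, given condition
    c_prime = (smm * sxy - sxm * smy) / det    # condition -> outcome, given mediator
    return a, b, c_prime, c_total


# Hypothetical gains: x = condition (0/1), m = number line gain, y = fraction gain.
x = [0, 0, 0, 0, 1, 1, 1, 1]
m = [1, 2, 1, 3, 4, 5, 3, 6]
y = [2, 3, 2, 5, 6, 8, 5, 9]

a, b, c_prime, c_total = mediation_paths(x, m, y)
indirect = a * b
print(round(c_total, 3), round(c_prime, 3), round(indirect, 3))
```

When most of the total effect is carried by the indirect term, the direct effect shrinks toward zero, which is the pattern reported for the drop from 3.34 to 0.84.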

Pre- and posttest correlations. Correlations between pre- and posttests are
presented in Table 12. The WRAT-4, a measure of general mathematics achievement,
was moderately and significantly correlated with all the posttests. The correlations
ranged from .474 to .568, an optimal range for a covariate (Keppel & Zedeck, 1989).
The NLE 0-1 pretest was also moderately and significantly correlated with all the
posttests, with correlations ranging from .393 to .534. TUF-4 did not correlate as well
with the posttests (r = .140 to .264), perhaps because it was used as a screener and
therefore the scores of the students included in the study had a severely restricted
range.
Table 12

Correlations Between Pre- and Posttest Student Measures

                                              Test of
                                              Fraction     Curriculum-
                          TUF-4     TUF-5     Procedures   aligned       NLE 0-1   NLE 0-2
                          Post      Post      Post         Post          Post      Post
TUF-4 Pre                 .190**    .140      .204**       .264***       .211**    .161*
WRAT-4 Pre                .568***   .560***   .561***      .493***       .474***   .505***
Procedures (Add/Sub) Pre  .307***   .260***   .332***      .258***       .223**    .290***
NLE 0-1 Pre               .507***   .429***   .446***      .393***       .477***   .534***

Note. Total sample size N = 186 students (I = 87, C = 99). Sample size for TUF-5, Test of
Fraction Procedures, NLE 0-1, and NLE 0-2 is 185 students (I = 86, C = 99). Computed
correlation used Pearson method with pairwise deletion.
***Significant at p = .001; **significant at p = .01; *significant at p = .05.



4.4 Understanding Implementation 

Intended versus actual implementation of the fractions intervention. The 52 
TransMath fractions lessons were intended to be 35 minutes each and delivered 3 to 4 
times per week in small groups of 5 students. In actuality, the overall average session 
length was 34 minutes (median = 34:00 minutes; range = 14 to 54 minutes). Although 
every participating teacher originally consented to the 35-minute tutoring block, several 
teachers later insisted on restricting the lessons to a block of 25 to 30 minutes. This was 
mainly an issue at District 1 and at two schools in District 2. 

The actual mean number of sessions per week ranged from 2.5 to 3.1 lessons 
across the 21 tutoring groups (median ranged from 2 to 4 lessons per week). Most of 
the tutoring groups began the intervention on a 4-day schedule; however, several 
classroom teachers requested a 3-day tutoring schedule after the first few weeks of the 
intervention to accommodate unexpected schedule changes within their math 
departments. Tutoring lessons frequently had to be rescheduled, most often at
District 2 due to an abundance of unanticipated field trips and school assemblies. In
addition, the designated tutoring rooms were often unavailable at this district, and the
tutoring sessions had to be rescheduled. Unexpected school activities and room
changes were not an issue in the schools in District 3, which had designated tutoring 
rooms available and intervention time blocks incorporated into the weekly school 
schedule. 

At randomization, each of the 21 tutoring groups consisted of 4 to 5 students; 
however, after attrition, group sizes ranged from 2 to 5 students. There was only one 
case in which a tutoring group had only 2 students. The median group size was 4 
students (mean = 4.24; SD = 0.89). 

Fidelity of implementation. For each of the 21 tutoring groups, fidelity was 
assessed for eight lessons (Lessons 4, 11, 20, 27, 31, 35, 43, and 47) by a member of 
the research team. Findings from both procedural fidelity as well as quality of instruction 
are described next. 

Procedural fidelity. Across the 21 tutoring groups (and across all eight lessons), 
on average, 78.05% of the steps were completed by the tutors (median = 81.08%; 
range = 69.57-85.07%). Of the 21 groups, eight groups completed over 80% of the 
steps (range = 80.14% to 92.91%; median = 88.08%), and 10 groups completed 
between 70 and 80% of the steps (range = 70.47 to 79.19; median = 79.31%). Fidelity 
for three groups was below 70% (range = 51.72% to 67.59%; median = 60.29%). 

When examined by lesson, the highest mean fidelity was recorded for Lesson 4: 
Representing Fractions with Cuisenaire Rods, with 85.07% of the steps completed. The 
lesson with the lowest mean procedural fidelity was Lesson 31: Multiplying Proper 
Fractions Using an Area Model (mean = 69.57%; range = 51.22-85.37%). Thirteen 
groups’ implementations of Lesson 31 had procedural fidelity less than or equal to
75.00%. Four tutoring groups had procedural fidelity less than or equal to 60.00%. One
reason for this low fidelity for Lesson 31 might be related to the number of activities that
had to be completed during this session. It was a complex lesson that focused on the 
key concept of fraction multiplication. It was also a dense lesson, with many key 
concepts and activities that had to be covered. Compared to the 37 steps that had to be 
implemented in Lesson 4, 82 steps had to be completed in Lesson 31. Perhaps it was a 
combination of these issues—difficult topic combined with a content-intense lesson— 
that made it difficult for the tutors to teach all the aspects of the lesson in the given time. 
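
Procedural fidelity here is the percentage of scripted lesson steps that the tutor completed. As a small worked example, the sketch below expresses the reported Lesson 31 range in terms of its 82 steps. The function name is hypothetical, and the step counts of 42 and 70 are inferred from the reported percentages rather than stated in the report.

```python
def fidelity_percent(steps_completed, total_steps):
    """Percentage of scripted steps completed, rounded to two decimals."""
    return round(100 * steps_completed / total_steps, 2)


LESSON_31_STEPS = 82  # stated in the report

# Inferred counts (assumption): 42 and 70 completed steps reproduce the
# reported Lesson 31 range of 51.22% to 85.37%.
lowest = fidelity_percent(42, LESSON_31_STEPS)
highest = fidelity_percent(70, LESSON_31_STEPS)
print(lowest, highest)
```

On this reading, the lowest-fidelity group skipped about 40 of the 82 steps in Lesson 31, which gives a sense of how much content the dense lesson left uncovered in the time available.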

Quality of instruction during implementation. Fidelity was also assessed by 
examining the quality of instruction provided in each tutoring group. Six items were 
rated on a 5-point Likert scale, with 1 being Low and 5 being High. Mean quality ratings 
for each instructional behavior are presented in Table 13. Overall, across the 21 groups 
and the eight fidelity lessons, the average quality rating was 3.99. Nine groups had a
rating greater than 4 (median = 4.50). The rating for 11 groups was between 3 and 4 
(median = 4.00). Only one group received a rating of 2.86, indicating that in most of the 
groups, the quality of instruction provided by the tutors was above average. 

Overall quality ratings were lowest for Lesson 43: Converting Mixed Numbers to 
Fractions Over One (mean = 3.83; median = 4). For this lesson, groups received ratings 
as low as 1 and 2 on five of the six quality attributes. Lesson 20: Fractions That 
Represent the Same Number had the highest overall quality rating (mean = 4.12; 
median = 4.5). For this lesson, eight of the 21 tutoring groups received ratings above 4 
on all six quality attributes, and 12 of the 21 groups received ratings above 4 on at least 
five of the six quality attributes. 

Across all the groups and fidelity lessons, the behavior that received the lowest
quality rating (a rating of 1 or 2) most often was providing specific math-oriented praise
(mean = 3.34; median = 3). The two instructional behaviors with the highest mean
quality ratings were maintaining a positive rapport with the students (mean = 4.52) and 
using clear and mathematically correct language (mean = 4.34). 

As probing for understanding and facilitating student discussion were an 
important aspect of the intervention, the fidelity observers also rated each group on 
whether the tutor asked open-ended questions to probe student thinking and 
understanding before providing an answer or solution. The fidelity observers indicated 
that, across all tutoring groups and lessons, tutors probed for student understanding 
frequently 64% of the time and sometimes 30% of the time. 


Table 13
Quality of Implementation Across All Eight Lessons for All 21 Tutoring Groups

Fidelity Item                                        Mean   SD     Range       Median
Observer's overall rating of the tutor’s             3.99   0.83   3.83-4.12   4.00
  implementation.
Observer's perception of students’ grasp of the      4.08   0.85   3.98-4.17   4.00
  content.
Tutor paces the lesson so that all parts of the      3.84   1.03   3.71-4.17   4.00
  session were covered in sufficient depth.
Tutor uses clear and mathematically correct          4.34   0.74   4.14-4.62   4.50
  language.
Tutor enhances students’ explanations.               4.13   0.90   3.90-4.31   4.00
Tutor provides specific math-oriented praise.        3.34   1.34   2.76-4.17   3.00
Tutor maintains a positive rapport with              4.52   0.75   4.36-4.76   5.00
  students.

                                                     %        %           %
                                                     Rarely   Sometimes   Frequently
When students are explaining their answers, the      6.02     30.12       63.86
  tutor asks open-ended questions to probe
  thinking and understanding before providing
  an answer or solution.

Note. Each of the quality of implementation items is rated on a 5-point Likert scale, with 1
indicating a low score, 3 being moderate, and 5 indicating a high score.


Reliability of the fidelity data. Inter-rater reliability on fidelity scoring was
assessed on 13.10% of the sessions (n = 22), by having two raters independently
assess the fidelity. The mean inter-rater reliability for the procedural fidelity was 81.57%
(median = 81.03%). The mean inter-rater reliability for quality of implementation ratings 
was 75.97% (median = 85.71%). 

Correlation between fidelity and student fractions outcomes. Both
procedural fidelity and quality of implementation ratings were moderately and
significantly correlated with TUF-4, TUF-5, and the Test of Fraction Procedures (r = .372
to .450, p < .001). See Table 14. Fidelity was correlated to a lesser extent, but still
significantly at p < .01, with the Curriculum-aligned Measure (r = .324 and .271 for
procedural fidelity and quality of implementation, respectively). While correlations
between fidelity of implementation and NLE 0-1 were very low and non-significant,
correlations with NLE 0-2 were low but significant at p < .05 (r = .242 and .246 for
procedural fidelity and quality of implementation, respectively).


Table 14
Correlation Between Fidelity of Implementation and Student Fraction Outcomes

                                                Test of
                                                Fraction      Curriculum-
                            TUF-4     TUF-5     Procedures    aligned       NLE 0-1   NLE 0-2
Fidelity                    Post      Post      Post          Post          Post      Post
Procedural                  .379***   .449***   .450***       .324**        .043      .242*
Quality of Implementation   .372***   .423***   .414***       .271**        .114      .246*

Note. Total sample size N = 87 intervention students. Sample size for TUF-5, Test of Fraction
Procedures, NLE 0-1, and NLE 0-2 is 86 intervention students.
***Significant at p = .001; **significant at p = .01; *significant at p = .05.
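
The significance levels attached to these correlations can be roughly checked from r and
the sample size alone. The sketch below uses the Fisher z approximation, which is an
assumption made here for illustration: the report does not say which test was used, and
an exact t-test would give slightly different p-values. The function name is
hypothetical.

```python
import math

def fisher_z_p_value(r, n):
    """Approximate two-sided p-value for H0: rho = 0 using Fisher's z transform."""
    z = math.atanh(r) * math.sqrt(n - 3)
    # Two-sided tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

# Procedural fidelity correlations reported in the text, with n = 86.
p_test_of_fraction_procedures = fisher_z_p_value(0.450, 86)
p_nle_0_2 = fisher_z_p_value(0.242, 86)
p_nle_0_1 = fisher_z_p_value(0.043, 86)
```

With about 86 students, a correlation near .24 falls below the .05 threshold while one
near .04 does not, which matches the pattern described for NLE 0-2 and NLE 0-1.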
4.5 Findings from Surveys

Classroom teacher survey of instructional practices during core 
mathematics instruction. Core classroom mathematics teachers varied in terms of 
when they started teaching fractions to their students. Variations were seen within and 
across districts. In District 1, 63.64% of teachers began teaching fractions in spring, and
the rest started in fall and winter (September—December 2016 = 9.09%; January— 
February 2017 = 27.27%). In District 2, fraction instruction started at different time 
points in fall, with all teachers having begun by January 2017 (September 2016 = 
18.18%; November 2016 = 45.45%; December 2016 = 9.09%; January 2017 = 27.27%). 
In District 3, a majority of the teachers (85.71%) started teaching fractions in November 
and December 2016, and the rest (14.29%) started in January 2017. 

My Math (McGraw-Hill Education, 2017b) was used as the core curriculum in 
District 1, California Math (McGraw-Hill Education, 2017a) was used in District 2, 
and GO Math! (Houghton Mifflin Company, 2018) was used in District 3. Ninety percent 
of the teachers indicated that they used a district-adopted textbook; 83 percent used 
supplemental material including lessons/problems developed by the district (31.03%). 

To facilitate a comparison between the fractions intervention and the core 
classroom fractions instruction, teachers were asked several specific questions 
regarding the fractions topics they teach, the fractions operations they focus on, the 
representations and methods they use, and their expectations for their students 
regarding explanations and discussions. Responses to these questions are summarized 
in Table 15. Note that items that are starred in the table are also addressed in the
TransMath fractions intervention provided to the students.

Table 15
Classroom Teacher Survey of Core Mathematics Instruction

Survey Item                                                              Percentage of Teachers
I use the following representations to teach fractions:
  Concrete manipulatives*                                                75.86
  Visual representations*                                                96.55
  Number lines*                                                          96.55
I use number lines to teach the following fraction content:
  Fraction magnitude*                                                    51.72
  Fraction equivalence*                                                  96.55
  Fraction comparing*                                                    86.21
  Fraction addition*                                                     65.52
  Fraction subtraction*                                                  65.52
  Fraction multiplication*                                               31.03
  Fraction division*                                                     31.03
I teach the following fraction operations:
  Add/subtract fractions or mixed numbers with unlike denominators*      96.55
  Multiply a fraction times a fraction*                                  96.55
  Divide a fraction by a fraction*                                       93.10
  Add/subtract mixed numbers that require whole-number regrouping*       89.66
  Multiply mixed numbers*                                                89.66
I use the following methods for teaching students how to compare or
order fractions when evaluating magnitude:
  Cross multiplying                                                      79.31
  Number line*                                                           93.10
  Benchmark fractions*                                                   79.31
  Finding common denominators*                                           89.66
  Drawing a picture of each fraction                                     89.66
  Use of manipulatives*                                                  82.76
  Thinking about the meaning of the numerator (part) and the             89.66
    denominator (whole)*
  Thinking about the relative size of the numerator compared to the      70.4
    denominator*
I use the following methods when teaching students how to solve fraction
word problems:
  Draw a picture                                                         100.00
  Think about problem types                                              82.76
  Make a table                                                           65.52
  Write an equation*                                                     93.10
  Focus on key words                                                     100.00

Survey Item                                 Never    Rarely   Sometimes   Often    Always
I require students to explain their         0.00     3.45     24.14       48.28    24.14
  thinking after solving a problem.*
I require students to explain using         0.00     0.00     6.90        51.72    41.38
  mathematically valid vocabulary.*
I use prompt cards to help guide            55.17    17.24    20.69       3.45     3.45
  students as they form explanations.*
Students explain their thinking or          0.00     3.45     17.24       58.62    17.24
  understanding of fractions concepts
  orally.*
Students explain their thinking or          0.00     3.45     17.24       58.62    17.24
  understanding of fractions concepts
  in writing.*
Students get to work on novel word          6.90     10.34    55.17       24.14    0.00
  problem-solving activities related to
  fractions.*

Note. Total Analytic Sample N = 29 teachers. Some percentages do not sum to 100 because
some teachers selected more than one answer. Items that are starred in the table are also
addressed in the TransMath fractions intervention provided to the students.


Almost all classroom mathematics teachers reported using visual representations 
including number lines (96.55%); about three-fourths indicated they also used concrete 
manipulatives. Responses to the question regarding the use of number lines to teach 
fraction content were of particular interest as the number line was a critical tool used 
throughout the TransMath intervention. All but one teacher reported using a number 
line, but the extent of use varied dramatically. While a majority of the teachers used the 
number line to teach fraction equivalence (96.55%) and fraction comparison (86.21%), 
only 51.72% used it to teach fraction magnitude. Teachers were also using the number 
line to teach fractions operations. About two-thirds were using it to teach addition and 
subtraction (65.52%), but a much smaller percentage used number lines for division and 
multiplication of fractions (31.03%). 
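
Because Table 15 is based on 29 teachers, each percentage corresponds to a whole number
of respondents, which is how 96.55% translates into "all but one teacher." The sketch
below shows that conversion. The function name is hypothetical, and the counts are
derived here from the reported percentages rather than taken from the study's raw data.

```python
N_TEACHERS = 29  # analytic sample reported in the note to Table 15

def respondents_from_percent(percent, n_total=N_TEACHERS):
    """Recover the whole-number count behind a percentage rounded to two decimals."""
    count = round(percent * n_total / 100)
    # Sanity check: the count should reproduce the reported percentage.
    if abs(100 * count / n_total - percent) > 0.01:
        raise ValueError(f"{percent}% is not a whole number of {n_total} respondents")
    return count

used_number_lines = respondents_from_percent(96.55)           # 28 of 29 teachers
number_line_for_magnitude = respondents_from_percent(51.72)   # 15 of 29 teachers
number_line_for_division = respondents_from_percent(31.03)    # 9 of 29 teachers
```

The same check can flag table entries that do not map to a whole number of respondents,
which may indicate a transcription problem.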

Another important element of the fractions intervention was the emphasis on
student explanations as a vehicle for determining student understanding. A majority of
the core classroom mathematics teachers indicated that they required their students to
explain their thinking after solving a problem, but the frequency varied (always: 24.14%;
often: 48.28%; sometimes: 24.14%). As in the fractions intervention, they also required
their students to explain using mathematically valid vocabulary (often: 51.72%; always:
41.38%). Teachers reported having students provide explanations orally as well as in
writing.

Another key aspect of the TransMath fractions intervention is the focus on novel 
problem-solving activities. Over 50% of the classroom mathematics teachers indicated 
that their students worked on novel problems some of the time; another 24.14% 
indicated that they often worked on such problems. 

One difference between the TransMath fractions intervention and core classroom 
mathematics instruction was the use of prompt cards to help guide students as they 
form their explanations. Most of the teachers reported never (55.17%) or rarely 
(17.24%) using the prompt cards. 

Instructional activities of comparison students. Classroom mathematics 
teachers were asked to provide a description of the activities their comparison students 
engaged in during the 35-minute intervention block when intervention students were 
receiving the fractions intervention. Students were engaged in a variety of instructional 
and intervention activities. See Table 16. As anticipated, very few teachers (17.24%)
reported providing a structured mathematics intervention.

Table 16
Activities of Comparison Students During the 35-minute Intervention Block

What are the rest of your students doing while the selected students    Percentage of
go to fractions tutoring?                                               Teachers
Math intervention                                                       17.24
Reading intervention                                                    17.24
Core math                                                               31.03
Other instruction (ELA, Science, Social Studies)                        72.41
Other (P.E., homework review)                                           17.24

Note. Total Analytic Sample N = 29 teachers. Percentages do not sum to 100 because some
teachers selected more than one answer.


Appraisal surveys. Findings from student, tutor, and classroom teacher 
appraisal surveys are presented below. 

Student survey. Virtually all of the fifth-graders (96.55%) indicated that they
liked attending the fractions tutoring group all the time or some of the time. Over 85%
indicated that they found fractions tutoring helpful; 73.56% also indicated that they had
a better understanding of fractions after going to the tutoring group. In general, while
students found both Cuisenaire Rods and number lines to be helpful, slightly more
students found number lines helpful (93.10%) than Cuisenaire Rods
(86.21%). Most students (90.80%) indicated that they understood fraction addition well;
in contrast, fewer students (54.02%) stated that they understood division of fractions.
Fifty-seven percent of the students in the tutoring group indicated that they found
fractions difficult; this is somewhat lower than what we had anticipated given their
relatively low scores on the screening battery. See Table 17 for a summary of
responses from the student survey.

Table 17
Student Appraisal Survey

Survey Item                                   Percentage of Students

                                              Yes, most of   Yes,          No, I did not like going
                                              the time       sometimes     to fractions tutoring
Did you like going to your fractions          54.02          42.53         2.30
  tutoring group?

                                                             Yes, sort of
                                              Yes, helpful   helpful       No, not helpful
Did you find fractions tutoring helpful?      85.06          13.79         0.00

                                                             Yes, sort of
                                              Yes, better    better        No, not better
Do you understand fractions better now        73.56          25.29         0.00
  after going to your tutoring group?

                                              Yes,           Yes, sort of
                                              difficult      difficult     No
Do you find fractions difficult?              3.45           54.02         41.38

                                                             Yes, sort of
                                              Yes, helpful   helpful       No, not helpful
Cuisenaire Rods helped me understand          47.13          39.08         12.64
  fractions.
Number lines helped me understand             55.17          37.93         5.75
  fractions.

How well do you understand these fraction topics?   I know these well
  Equivalent fractions                              60.92
  Adding fractions                                  90.80
  Subtracting fractions                             79.31
  Multiplying fractions                             74.71
  Dividing fractions                                54.02
  Putting fractions on a number line                56.32

Note. Total sample size N = 87 intervention students. Some percentages do not sum to 100
because some students did not respond to all survey items.


Tutor survey. All tutors indicated that their students had improved as a result of
tutoring. See Table 18. All also felt that their students appeared to be more confident
and that they were able to articulate their understanding of fraction concepts. Seventy
percent of the tutors also indicated that students were writing coherent explanations,
understanding and using math vocabulary, and making fewer calculation errors. Only
40% indicated that they noticed students use number lines while solving problems.
While most of the tutors (70%) felt that the TransMath curriculum was somewhat easy
to follow, they acknowledged that the lessons in TransMath were very well organized
(90%). Most tutors felt that working with five students per group was manageable (80%
agree; 60% strongly agree); however, many also felt that it was difficult to meet
students’ needs as they were at different levels (30% agree; 30% strongly agree). While
80% of the tutors felt that the training provided by the research staff was sufficient, they
identified areas where additional training would be useful: (a) more practice with
Cuisenaire Rods, (b) strategies for dealing with disruptive students or helping English
language learners with written explanations, and (c) examples of what a lesson should
look like.

When asked what the most difficult aspect of tutoring was, tutors raised issues
such as having limited time to cover the material, having to deal with student behavior
issues, catering to students at different levels, having uncooperative classroom
teachers, and not having a reliable schedule and/or tutoring room. Tutors also
suggested holding tutoring sessions when students are studying fractions in their
regular classroom rather than before or after fractions have been taught, conducting
tutoring sessions in the morning rather than in the afternoon, having smaller groups (3-4
students) to make sure all students are on track and to provide more individualized
instruction, and excluding students with behavioral/emotional issues, as the tutors were
unable to manage them.

Table 18
Tutor Appraisal Survey

Survey Item                                                       Percentage of Teachers
                                                                  Yes
Do you think the students in your tutoring group(s) improved?     100.00
What improvements did you notice in your students?
  Appear more confident                                           100.00
  Able to articulate their understanding of fraction concepts     100.00
  Coherent written explanations                                   70.00
  Understand and use math vocabulary                              70.00
  Fewer errors in calculations                                    70.00
  Participate in classroom discussions                            50.00
  Use number line while solving problems                          40.00

                                            Always     Mostly     Somewhat   Not at all
Did you feel the curriculum was easy        0.00       20.00      70.00      10.00
  to follow?
Did you feel the lessons were well          0.00       90.00      10.00      0.00
  organized?
Was the training provided by the            80.00      20.00      0.00       0.00
  research staff sufficient?

                                            Strongly                         Strongly
                                            Agree      Agree      Disagree   Disagree
Working with five students per group        60.00      30.00      10.00      0.00
  was manageable
It was difficult to meet student needs      30.00      30.00      30.00      10.00
  as they were at different levels
Student behavior problems took time         30.00      40.00      30.00      0.00
  away from teaching
It was difficult to complete the lessons    30.00      20.00      40.00      10.00
  in the allocated time
School staff were helpful and               40.00      60.00      0.00       0.00
  welcoming
Teachers were friendly and                  30.00      70.00      0.00       0.00
  cooperative
Room provided for tutoring was              60.00      30.00      0.00       10.00
  satisfactory

                                            Definitely                       Definitely
                                            Yes        Probably   Maybe      No
Would you tutor again?                      50.00      30.00      10.00      10.00
Would you test again?                       70.00      10.00      20.00      0.00

Note. Total number of tutors = 10.


Classroom teacher survey. Sixty percent of the classroom teachers indicated
that their students (who received the fractions intervention) showed improved fraction
understanding. See Table 19. Another 36.67% indicated that their students had
improved, but only minimally. Close to 70% of teachers felt that the fractions tutoring
was beneficial or very beneficial for their students. When asked to compare the fractions
performance of their intervention students with their class peers, 66.67% of the teachers
indicated that the fractions performance was average; another 3.33% indicated that it
was above average, and another 23.33% indicated that it was below class average.

Like the tutors (but to a lesser extent), the teachers indicated that the students appeared
more confident (75%), were participating in classroom discussions (59.38%), and were
able to articulate their understanding of fractions concepts (46.88%).


Table 19
Classroom Teacher Appraisal Survey

Survey Item                                 Percentage of Teachers

                                            Very                      Somewhat      Not at all
                                            beneficial   Beneficial   beneficial    beneficial
Overall, how beneficial was the             25.00        43.75        31.25         0.00
  fractions tutoring for your small
  group of students?

                                                                      Somewhat
                                            Above class  At class     below class   Way below
                                            average      average      average       class average
How is their fractions performance          3.33         66.67        23.33         6.67
  when compared to their peers?

                                                         Somewhat
                                            Yes          improved     No
Have these students shown an                60.00        36.67        3.33
  improvement in their understanding
  of fractions?

                                                                      Yes
On average, what improvements relating to fractions have you
seen for the students who participated in tutoring:
  Appear more confident                                               75.00
  Participate in classroom discussions                                59.38
  Able to articulate their understanding of fraction concepts         46.88
  Fewer errors in calculations                                        28.13
  Use number line while solving problems                              9.38
If fractions tutoring were to be offered again next school year,      78.13
  would you allow your students to participate?

Note. Total sample size N = 30-32 teachers because some teachers did not respond to all
survey items.

5. Summary

The purpose of this randomized controlled trial was to assess the impact of a 
fractions intervention on at-risk fifth-graders’ understanding of foundational fractions 
concepts as well as grade-level competence with fractions operations. The fractions 
intervention used 52 modified lessons from the TransMath curriculum. Each modified 
lesson was 35 minutes in duration. The intervention was provided 3—4 times a week. 

The impact of the fractions intervention was assessed using a randomized 
controlled trial. Findings indicate that students who received the TransMath fractions 
intervention performed significantly better on a range of fractions outcomes than 
students who did not receive the intervention. Effect sizes (Hedges’ g) ranged from .66 
to 1.08, and p-values were all less than .0001 even after the Benjamini-Hochberg 
correction. 
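
For readers less familiar with the effect size metric, the sketch below computes Hedges'
g in its standard textbook form: the standardized mean difference multiplied by a
small-sample correction. This is an illustration under stated assumptions, not the
study's analysis. The group sizes match the analytic sample reported in Appendix B
(I = 87, C = 99), but the means and standard deviations are hypothetical, and the
report's estimates may come from covariate-adjusted models.

```python
import math

def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Standardized mean difference with Hedges' small-sample correction."""
    pooled_var = ((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2)
    d = (mean_t - mean_c) / math.sqrt(pooled_var)
    correction = 1 - 3 / (4 * (n_t + n_c) - 9)  # approaches 1 as samples grow
    return d * correction

# Hypothetical posttest means and SDs; only the group sizes come from the report.
g_example = hedges_g(mean_t=15.0, mean_c=12.0, sd_t=4.0, sd_c=4.0, n_t=87, n_c=99)
```

With samples of this size the correction factor is close to 1, so g and the uncorrected
standardized difference are nearly identical.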

In addition, a series of five performance assessments was used to assess
students’ understanding of the underlying concepts and their rationale for solving the
problems. On a randomly selected sub-sample, intervention students performed
significantly better than comparison students on all five performance assessments (g =
.68 to 1.23); the impacts were all statistically significant at p = .01, even after correction
for multiple comparisons.
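
The Benjamini-Hochberg correction mentioned above adjusts a family of p-values to
control the false discovery rate. A minimal sketch of the standard step-up procedure
follows. The function name and the five p-values are hypothetical and are not the
study's results.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p-values for five outcome measures.
raw_p = [0.001, 0.008, 0.039, 0.041, 0.20]
adjusted_p = benjamini_hochberg(raw_p)
```

An outcome remains significant after correction when its adjusted p-value is still below
the chosen threshold.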

These findings demonstrate that a fractions intervention that is grounded in a
curriculum that consistently emphasizes use of the number line to build understanding
of foundational concepts as well as operations and that emphasizes problem solving
and provides opportunities to check for student understanding has the capacity to
improve at-risk students’ grade-level fractions performance. The findings have
meaningful and practical value, given the well-established, predictive relationship
between knowledge of fractions at age 10 (i.e., 5th grade) and performance in algebra
and overall mathematics achievement in 11th grade (Siegler et al., 2012). With a solid
foundation in fractions and other rational number topics being a key predictor of success
in algebra (e.g., Booth, Newton, & Twiss-Garrity, 2014; Geary, Hoard, Nugent, & Bailey,
2012; Siegler et al., 2012), it is critical that students attain proficiency in fractions in
upper-elementary grades before moving to more advanced mathematics in middle
school.

Appendix A

Figure A1
Performance Assessment 1 Intervention Student Response

Figure A2
Performance Assessment 1 Comparison Student Response

Figure A3
Performance Assessment 2 Intervention Student Response

Figure A4
Performance Assessment 2 Comparison Student Response

Figure A5
Performance Assessment 3 Intervention Student Response

Figure A6
Performance Assessment 3 Comparison Student Response

Figure A7
Performance Assessment 4 Intervention Student Response

Figure A8
Performance Assessment 4 Comparison Student Response

Figure A9
Performance Assessment 5 Intervention Student Response

Figure A10
Performance Assessment 5 Comparison Student Response

Appendix B

Figure B1
Curriculum-aligned Posttest Moderated by WRAT-4 Pretest

Figure B2
TUF-5 Posttest Moderated by WRAT-4 Pretest

Figure B3
NLE 0-1 Posttest Moderated by WRAT-4 Pretest
Note. NLE 0-1 and NLE 0-2 are reported in terms of percent absolute error (i.e., a low score
indicates high performance).

Figure B4
Curriculum-aligned Posttest Moderated by NLE 0-1 Pretest
Note. NLE 0-1 and NLE 0-2 are reported in terms of percent absolute error (i.e., a low score
indicates high performance).

Figure B5
NLE 0-1 Posttest Moderated by NLE 0-1 Pretest
Note. NLE 0-1 and NLE 0-2 are reported in terms of percent absolute error (i.e., a low score
indicates high performance).


Table B1
NLE 0-1 Posttest Moderated by District 2 vs. Not (Districts 1 and 3)

Effect               Estimate   Lower     Upper     t (df)   p
I-C District2        -13.902    -17.662   -10.142   -7.34    < .0001
I-C Not              -7.476     -11.216   -3.736    -3.96    .001
I-C District2-Not    -6.426     -11.730   -1.123    -2.40    .0181

Note. Total Analytic Sample N = 186 students (I = 87, C = 99). NLE 0-1 and NLE 0-2 are
reported in terms of percent absolute error (i.e., a low score indicates high performance).


Table B2
NLE 0-1 Posttest Moderated by District 3 vs. Not (Districts 1 and 2)

Effect               Estimate   Lower     Upper     t (df)   p
I-C District3        -4.289     -10.190   1.612     -1.44    .1526
I-C Not              -12.001    -14.828   -9.174    -8.42    < .0001
I-C District3-Not    7.713      1.170     14.256    2.34     .0213

Note. Total Analytic Sample N = 186 students (I = 87, C = 99). NLE 0-1 and NLE 0-2 are
reported in terms of percent absolute error (i.e., a low score indicates high performance).

Table B3
TUF-4 Posttest Moderated by Free and Reduced-Price Lunch (FRL) vs. Not

Effect               Estimate   Lower     Upper     t (df)   p
I-C FRL              2.103      0.583     3.622     2.77     .0075
I-C Not              4.884      2.763     7.006     4.59     < .0001
I-C FRL-Not          -2.782     -5.399    -0.165    -2.11    .0374

Note. Total Analytic Sample N = 186 students (I = 87, C = 99).


Table B4
NLE 0-1 Posttest Moderated by Free and Reduced-Price Lunch (FRL) vs. Not

Effect               Estimate   Lower     Upper     t (df)   p
I-C FRL              -12.418    -15.866   -8.970    -7.14    < .0001
I-C Not              -6.231     -10.965   -1.496    -2.61    .0104
I-C FRL-Not          -6.188     -12.087   -0.288    -2.08    .0400

Note. Total Analytic Sample N = 186 students (I = 87, C = 99). NLE 0-1 and NLE 0-2 are
reported in terms of percent absolute error (i.e., a low score indicates high performance).
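
In each moderation table, the third row is the difference between the
intervention-comparison (I-C) effect in the named subgroup and the I-C effect in the
remaining students. The sketch below checks that relationship using the estimates
printed in Tables B1, B3, and B4. The variable and function names are hypothetical, and
small mismatches in the third decimal are expected because the published estimates are
rounded.

```python
# (subgroup I-C effect, "Not" I-C effect, reported difference) as printed in the tables.
moderation_rows = {
    "B1 District 2 vs. Not": (-13.902, -7.476, -6.426),
    "B3 FRL vs. Not (TUF-4)": (2.103, 4.884, -2.782),
    "B4 FRL vs. Not (NLE 0-1)": (-12.418, -6.231, -6.188),
}

def moderation_contrast(subgroup_effect, reference_effect):
    """Difference in treatment effects between the subgroup and everyone else."""
    return round(subgroup_effect - reference_effect, 3)

# Absolute gap between the recomputed contrast and the value printed in each table.
discrepancies = {
    name: round(abs(moderation_contrast(sub, ref) - reported), 3)
    for name, (sub, ref, reported) in moderation_rows.items()
}
```

For NLE outcomes, a more negative I-C estimate means a larger reduction in percent
absolute error, so a negative contrast indicates a larger benefit in the named subgroup.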


